The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

CHARLES LEISERSON: Today we're going to talk about dynamic storage allocation, which is a hugely rich and interesting topic, and hopefully cover some of the basics of dynamic storage allocation today. I want to start out with perhaps the simplest storage allocation scheme, and that's stack allocation, which I think you're probably fairly familiar with. So stack allocation is just an array and a pointer into the array, an index or a pointer into the array. And when you have a stack in computer memory, you're talking about the array of all memory, and the stack pointer is the one that's used, for example, for function calls. The reason stack allocation is neat is because it's so simple.
If I want to allocate x bytes, then what I do is simply increment my stack pointer by x bytes, and then I return the original position of the stack pointer as the location of the storage that's just been allocated. So, pretty simple. Typically, if you write a stack allocator, the routines are so simple that they get inlined. It's of course advantageous to declare them inline, but most compilers these days will inline them anyway. And as you can see, it's just a couple of operations to perform this calculation. To deallocate the bytes, we basically-- oh, yes, I should mention, this isn't checking for overflow. So technically, one should check for overflow. One way to do that is with the assert package, and that way, in production builds at run time, you don't have to pay the extra overhead of checking. And see, these operations are so cheap that actually checking for overflow, checking to see whether you went over the right end of the array here, is perhaps as expensive as doing the rest of the operations themselves. So to free x bytes, you just simply decrement the stack pointer.
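As a concrete sketch of what's being described, here is a minimal stack allocator in C. The array size and the names are made up for illustration; the asserts are the overflow and underflow checks just mentioned, and they compile away when NDEBUG is defined.

```c
// Minimal sketch of a stack allocator: an array plus a pointer into it.
#include <assert.h>
#include <stddef.h>

#define STACK_SIZE 4096               /* illustrative size */

static char stack_mem[STACK_SIZE];    /* the array */
static char *sp = stack_mem;          /* the stack pointer into it */

/* Allocate x bytes: bump the pointer, return its old position. */
static void *stack_alloc(size_t x) {
    assert(sp + x <= stack_mem + STACK_SIZE);   /* check for overflow */
    void *p = sp;
    sp += x;
    return p;
}

/* Free x bytes: just decrement the stack pointer. */
static void stack_free(size_t x) {
    assert(sp - x >= stack_mem);                /* check for underflow */
    sp -= x;
}
```

Both routines are just a couple of pointer operations, which is why compilers happily inline them.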
And of course, technically, you should also check for underflow: make sure somebody didn't say to deallocate more bytes than were actually allocated on the stack. So of course, check for stack underflow. So this is pretty good, because allocating and freeing take just constant time. Very quick. But you're required to free memory consistent with the stack discipline, which is the last-in, first-out discipline. So you don't get to free something that you allocated long ago until you've freed everything else that's been allocated since. So therefore, it has limited applicability, but it's great when it works. Now it turns out, sometimes you can actually use the call stack yourself to store stack values, using a function called alloca. But this function is now deprecated, meaning that people advise you not to use it in any modern programming. And part of the reason is that the compiler is more efficient with fixed-size stack frames. You remember when we went through how the compiler actually lays out memory and does the calling convention?
If you have a fixed-size stack frame, then the frame pointer and the stack pointer are redundant, and so you can reclaim a register, namely, the frame pointer register, and do everything off the stack pointer. But if you use things like alloca, which push the size of the stack frame around, then you actually have to keep both pointers there, and that's a little bit less efficient in terms of the register usage. OK? So that's pretty much stack storage. Any questions about stacks? Yeah?

AUDIENCE: Is there anything that detects overflow or underflow without using asserts? Like, if you wanted to check--

CHARLES LEISERSON: Yeah, you could check explicitly. Just say, for this, before you do the allocation, check: is SP plus x bigger than the right end?

AUDIENCE: But how do you know where the right end is?

CHARLES LEISERSON: Well, because I'm assuming here that in this case, you are allocating it. Now, if you're talking about using the call stack, you mean? You'll never run out. We'll do that analysis in a little bit. But you'll never run out of stack space.
Although as a practical matter, most compilers assume a stack of a megabyte, and so if you overflow that, your program will fail in any of a variety of ways.

AUDIENCE: You can do guard pages.

CHARLES LEISERSON: You can do guard pages, which don't always work. But let me explain guard pages, OK? So what's sometimes done is, if I have my stack growing down in memory-- OK, so here's my stack, and it's growing this way-- and the thing that's typically growing up the other way is the heap, and I want to make sure that they don't collide somewhere in here. So one thing I can do is put some pages in here that are protected. These are read-only pages. Or actually, even better, is no permissions. So I can set no permissions, so that if you ever try to read or write that page, it automatically generates a fault. And so the runtime will sometimes do that. The difficulty is, what happens if you increased your stack so much that you went beyond your guard pages? But as a practical matter, most of the time that people exceed the guard pages, it's only by a small amount.
And so just putting a few guard pages in there will make sure that you can use the memory-management hardware, which basically ends up being free from your point of view as a programmer, in order to catch a stack overflow problem. And you can do that yourself. You can use the call mmap, which we haven't talked about, to set permissions on pages at user level.

AUDIENCE: [INAUDIBLE]

CHARLES LEISERSON: No. You can actually handle the error however you want. You can vector into a user-defined routine. Now, what you're going to do, I don't know, but you can indeed catch the error yourself and decide that there's some way of recovering, as opposed to crashing and burning. So generally, it's considered poor form for your software to fail, other than in a controlled way. So you should always have your software exiting or calling error, and it should never segfault. Most people consider any program that segfaults to be poorly engineered. Is there any case where there's a segfaulting program that is not poorly engineered? I don't know. I think "most people" is probably an understatement.
OK? So this is stack memory. You can build it yourself out of an array and a pointer, or you can use the call stack. And when you have something that's using that kind of discipline, it's great, because it's really cheap. Now, in C and C++, and we're going to start moving and talking more about things like C++, we have what's called heap allocation. And in C, the routines are called malloc and free, and C++ provides the new and delete operators. So unlike Java and Python, C and C++ provide no garbage collector. So in other words, you're responsible as a programmer for freeing allocated storage yourself. If you fail to do so, you create a memory leak. So a memory leak means I allocated something, I am no longer in a position to use it, but I failed to free it, so it can't be reused. And if you keep doing that, especially if you're in a loop, then you have a memory leak. Where if you look at a program with a memory leak, and you look using top
or whatever, how big is my program, you watch as it goes and computes and computes, and it keeps getting bigger and bigger and bigger as it's running, and it just runs away, getting more and more storage. Also, you have to watch out for things like dangling pointers. That is, you deallocated something, but somebody still wanted to use it. Or double freeing: you have two places in the code where you free the same storage. Very bad to try to free the same storage twice. Yeah?

AUDIENCE: [INAUDIBLE]

CHARLES LEISERSON: It's basically going to be sitting in the storage structure, which we'll talk about how those are organized in a few minutes-- going to be sitting in the storage structure, but the program is going to look at that and not recognize that it's in the storage structure, and think that it's something whose pointers can now all be reset. And so you basically clobber all the invariants of the storage structure, and somewhere down the road, you get a segmentation fault or something. OK? Yeah, John?
AUDIENCE: [UNINTELLIGIBLE] that comes out of that is--

CHARLES LEISERSON: Just a second. Why don't you take this?

AUDIENCE: [INAUDIBLE] One thing that often happens in the real world is, when you free something, it adds it to a free list, and if you free something twice, it will add two copies of that pointer to the free list. So later on, when you malloc, malloc might return the same pointer to you twice. So the same memory block might be allocated for two completely different functionalities, and attackers use that to do all sorts of interesting things to your program.

CHARLES LEISERSON: Yeah. OK? So fortunately, there are some really good-- why don't you keep that down there, and then if you guys want to chime in, you'll have some power to do so. So fortunately, there are now some tools that make it easier to debug these kinds of really nasty bugs. The one we'll be using is Valgrind, and what you do is you just say valgrind --leak-check=yes, and then run your program the way you would normally run it.
And what it will do is, it will monitor the way that the program is doing allocations and frees, and give you a report as to whether everything that was allocated was freed, and whether you tried to free something twice, and so forth. Now, this works with the existing allocators. If you have your own allocator that you decide you're going to write for a little piece of your code, you say, gee, I'm going to do my own little allocation here, you can actually communicate to Valgrind what that allocator is, and it can check for proper use of that structure as well. So you can see valgrind.org for details. And there are a lot of other programs on the market, and out there as open software, such as Purify and a variety of others. So let's talk about heap allocation. I want to start by talking about fixed-size allocation. So suppose the world has only one size of objects that you care about. Then it turns out that writing an allocator is really simple, and you could program one yourself. So here's basically how it works. It's called using a free list.
And what you have is, in your array, say, that you're using for storing all of your values, and here they're basically all the same size, you have a certain number of slots that are used. That means that the program is using them, not the storage allocator. And what you do is you take advantage of the fact that if it's an unused block of storage, then you can put a pointer in there. And so what you do is you string all these unused blocks of storage together into a free list. So we point to there, which points to there, which points to there, which points to there, which points nowhere. So you just simply link together all of the things that are free into that list. And then whenever you need to allocate an object, what you do is, you grab one off the free list. So you set x to the head of the free list: here x gets a pointer to this object, the one that's at the start of the list. Then you update free with free->next, which gives you this guy. So that's the next guy in line. And then you basically return x.
And that's all there is to it. You just grab the first element off the free list. Now, there's a little bit of subtlety, in that the thing that you're grabbing off has a garbage pointer in it, because we're not initializing storage. There are some storage allocators that will always set the value that's returned to the user to all zeros, for example. But typically malloc doesn't do that. It basically leaves that as uninitialized storage. And so you can actually go and see what values used to be stored in there. And so basically, you can end up with a garbage pointer there, but you really don't care about that, because it's no longer on the list. So you can see, for example, if I then went and I-- well, let's do this: take a look at what happens when we free an item. So if I want to free an object x, I take some object x here. Let's say it's this guy that's being freed. What I do is, I make it point to the first element of the list. So basically, x->next equals free, so he basically points there.
And then I say free equals x, meaning move the free pointer up to now point to the new beginning of the list. So we're basically pushing and popping things off the front of the list. It's just the stack discipline, except we're using a linked list, rather than using an array. So we push to free, and we pop to allocate. Now the tricky thing here-- you can sort of see what might go wrong if I then said, oh, let me free this thing again. Suppose you said, now that this is free, suppose you came in and say, oh, I'll free that thing again. Well, now you're going to go through the same code here, and you're going to see, you're going to completely screw up the free list. So that's why you don't want to do a double free. So you can see exactly what's going to go on in the free list implementation. So this is pretty good and pretty cheap. Allocating and freeing take constant time. That's great, just as with the stack pointer. It's got good temporal locality, and the reason is because you're tending to allocate the thing that you freed most recently.
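The allocate and free operations just described fit in a few lines of C. This is a sketch with made-up pool and block sizes; note that, exactly as discussed, nothing here detects a double free.

```c
// Sketch of a fixed-size free-list allocator.
#include <stddef.h>

#define POOL_BLOCKS 1024              /* illustrative pool size */

/* An unused block stores a pointer to the next free block; the
 * allocator reuses the block's own bytes to hold this link. */
typedef union block {
    union block *next;                /* valid only while the block is free */
    char payload[64];                 /* fixed object size (example value) */
} block_t;

static block_t pool[POOL_BLOCKS];
static block_t *free_list = NULL;

/* String every block of the pool together into the free list. */
static void pool_init(void) {
    for (int i = 0; i < POOL_BLOCKS; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

/* Allocate: pop the first element off the free list. */
static void *pool_alloc(void) {
    if (free_list == NULL) return NULL;   /* out of blocks */
    block_t *x = free_list;               /* x = free */
    free_list = free_list->next;          /* free = free->next */
    return x;                             /* returned storage is uninitialized */
}

/* Free: push the block back onto the front of the list.  Freeing the
 * same block twice corrupts the list; nothing here detects it. */
static void pool_free(void *p) {
    block_t *x = p;
    x->next = free_list;                  /* x->next = free */
    free_list = x;                        /* free = x */
}
```

Each operation is a couple of pointer moves, so both run in constant time, and the LIFO order is what gives the temporal locality discussed next.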
The free list operates as a last-in, first-out data structure. So from a temporal locality point of view, it's really very good. You're always using the most recently freed stuff whenever you allocate, and that means you have very good caching behavior from a temporal locality point of view. However, it has poor spatial locality, and the reason is due to what's called external fragmentation, where you get your storage distributed across virtual memory. So if you think about this, as you start allocating and freeing in different orders, the order of that free list gets very jumbled up. And as that increases, you end up jumping from one thing to another, and you can end up using a lot of the page table, because you're basically touching a huge piece of it. And in particular, the translation lookaside buffer can also be a problem. So the translation lookaside buffer typically works on a page basis. And if you recently referenced things on a given page, that's very good for the TLB, because it's already got the translation, the virtual address mapping, of that page dealt with.
So it basically has poor spatial locality, because it's getting all jumbled up. Compare that to stack allocation, which is always very nice in terms of its behavior, because you're going across consecutive locations in memory, which means you're always allocating and freeing on the same page, unless you trip over a page boundary. So here are some ways of mitigating external fragmentation. Of course, this makes the code for the free list more complicated. And then you have to weigh, how much am I losing in having a more complicated allocator, versus suffering the fragmentation that may occur? Yes, question?

AUDIENCE: You specified before that it was fixed size. Is that why we don't specify anything about the size?

CHARLES LEISERSON: That's right. We're just assuming that it's all whatever the unit size is. So this is really good when you've created your own data structure, and let's say you're doing nodes in a graph. You know exactly what elements you want in the graph, for each vertex.
And now what you want to do is basically be able to give yourself a node of the graph, free a node of the graph, and manage that storage yourself. Yes, question?

AUDIENCE: [INAUDIBLE] allocated across virtual memory [INAUDIBLE] one application?

CHARLES LEISERSON: Why is it allocated across virtual memory if it's all one application? Well, because the size that you may end up needing can end up being large. And I'll show you in just a minute how that happens. So here's the idea. What you can do is keep a free list per disk page. So a traditional page is 4,000 bytes. These days they talk about super pages that are megabytes, and who knows what size pages various operating systems will support. I believe the clouds, as configured right now, are using 4K, which leads to some strange anomalies where the TLB can address less stuff than the L3 cache. But so be it.
So basically, you just keep a free list per page, and always allocate from the free list of the page that's fullest, the page that you've allocated the most stuff off of, rather than from a page that's more empty. So what this means is that when you come to allocate, you have to figure out, which page should I do the allocation off of? But we'll have a free list on each page. Whenever you free a block of storage, you have to free it to the page on which the block resides. Don't put it into the free list of a different page. You put it into the free list of that particular page. Now, if a page becomes empty, in other words, there's only the free list on it, and there are no elements actually used by the programmer, the virtual memory system will page it out. And since the programmer's never referencing things on that page, it basically costs you nothing. You end up not having to have an entry in the TLB, and you may have written to it, but you will end up allocating things preferentially to the more full pages. And the basic idea is, it's better to have the use of your pages balanced 90/10 than 50/50. So this kind of configuration is better than that. So why is that? If you have 90% of your storage on one page and only 10% on another-- yeah, let's say we have only two pages in our system, and 90% of the objects are on one page, and 10% of the objects are on the other page, versus if they're 50/50. Then what you can do is look at the probability that two accesses are going to go to the same page. Because if you're going to the same page, then your paging hardware, your TLB, which is very much like a cache-- whenever you get a hit on the same page, that's good, it costs you nothing. If you hit on a different page, that's bad. So what happens is, the probability that two accesses hit the same page is, well, what's the probability that the first one was on this page, versus on the other? They're going to be on the same page if they're both on this page or they're both on the other page. And so that's going to be 0.9 times 0.9 plus 0.1 times 0.1. Whereas if it's 50/50, then the probability that you get a hit on the same page is 0.5 times 0.5 plus 0.5 times 0.5, which is only 50%.
455 00:23:39,460 --> 00:23:45,520 So we get an 82% hit rate versus the 50% hit rate. 456 00:23:45,520 --> 00:23:48,200 So it's much more likely. Taking it to 457 00:23:48,200 --> 00:23:51,140 the extreme, it's better to have everything on one page 458 00:23:51,140 --> 00:23:52,890 and nothing on the other. 459 00:23:52,890 --> 00:23:55,640 Then I get a 100% hit rate, because whenever I'm accessing 460 00:23:55,640 --> 00:24:00,750 something, I always am hitting on the page I had before. 461 00:24:00,750 --> 00:24:03,770 So this kind of heuristic helps it so that you're 462 00:24:03,770 --> 00:24:09,940 swinging your accesses to using fewer pages, and the 463 00:24:09,940 --> 00:24:13,300 fewer pages, the less likely it is that you're going to 464 00:24:13,300 --> 00:24:16,860 stress your virtual memory paging, disk swapping, you're 465 00:24:16,860 --> 00:24:18,840 going to stress your TLB and what have you. 466 00:24:18,840 --> 00:24:21,770 467 00:24:21,770 --> 00:24:23,660 Good for that? 468 00:24:23,660 --> 00:24:25,240 We're going to get deeper into this, so. 469 00:24:25,240 --> 00:24:30,130 470 00:24:30,130 --> 00:24:33,270 So that's basically fixed size allocation. 471 00:24:33,270 --> 00:24:35,230 It's really nice, because you can basically use a very 472 00:24:35,230 --> 00:24:37,220 simple free list. 473 00:24:37,220 --> 00:24:40,350 But of course what most people need and want is variable 474 00:24:40,350 --> 00:24:42,240 sized allocation. 475 00:24:42,240 --> 00:24:46,830 The most popular way of doing variable sized allocation is a 476 00:24:46,830 --> 00:24:50,020 scheme called binned free lists, and of that, there 477 00:24:50,020 --> 00:24:51,680 are a zillion variations. 478 00:24:51,680 --> 00:24:55,180 But mostly, people use binned free lists. 479 00:24:55,180 --> 00:24:58,330 The idea is, we're going to leverage the efficiency of 480 00:24:58,330 --> 00:24:59,880 normal free lists. 
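The arithmetic behind that 90/10-versus-50/50 comparison is easy to check directly. Here is a small sketch in C; the function name is mine, not from the lecture:

```c
#include <assert.h>

/* Two-page system: a fraction p of the objects live on one page and
   1 - p on the other.  Two independent accesses land on the same
   page with probability p*p + (1-p)*(1-p), as computed above. */
static double same_page_prob(double p) {
    return p * p + (1.0 - p) * (1.0 - p);
}
```

For p = 0.9 this gives the 82% figure; for p = 0.5 it gives 50%. The more skewed the page usage, the better the hit rate.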
481 00:24:59,880 --> 00:25:02,800 With normal free lists I can allocate and free in just one 482 00:25:02,800 --> 00:25:07,370 operation, in just a constant amount of time. 483 00:25:07,370 --> 00:25:09,700 And what we're going to do is accept some internal 484 00:25:09,700 --> 00:25:11,220 fragmentation. 485 00:25:11,220 --> 00:25:14,080 So by internal fragmentation I mean the following. 486 00:25:14,080 --> 00:25:19,130 When somebody asks for something of a given size, 487 00:25:19,130 --> 00:25:23,200 what we're going to do is actually give them something 488 00:25:23,200 --> 00:25:27,350 which is a little bit bigger, maybe. 489 00:25:27,350 --> 00:25:28,815 And so we'll waste some. 490 00:25:28,815 --> 00:25:32,460 And in particular, what we'll do is, we'll organize our 491 00:25:32,460 --> 00:25:36,550 system so that all the blocks that we ever give out are 492 00:25:36,550 --> 00:25:39,250 always a size that's a power of two. 493 00:25:39,250 --> 00:25:41,980 In practice, you don't always give out exactly a power of 494 00:25:41,980 --> 00:25:44,260 two, because then you get cache alignment problems. 495 00:25:44,260 --> 00:25:49,090 But this is the basic scheme: to give out things that are 496 00:25:49,090 --> 00:25:50,880 powers of two. 497 00:25:50,880 --> 00:25:53,930 And that way, if somebody asks for something that's 13 498 00:25:53,930 --> 00:25:57,270 bytes, you give them 16 bytes. 499 00:25:57,270 --> 00:26:00,780 For somebody who asks for 65 bytes, you have to 500 00:26:00,780 --> 00:26:03,570 give them 128 bytes. 501 00:26:03,570 --> 00:26:07,630 So you might waste up to as much as a factor of two. 502 00:26:07,630 --> 00:26:11,590 But now, for a given range of values, there's only 503 00:26:11,590 --> 00:26:15,000 a logarithmic number of different sizes that we have 504 00:26:15,000 --> 00:26:16,770 to give out, because we're only giving them out 505 00:26:16,770 --> 00:26:19,910 in powers of two. 
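The rounding rule just described (13 goes to 16, 65 goes to 128) can be written down in a few lines of C; the function name is my own:

```c
#include <assert.h>
#include <stddef.h>

/* Round a request up to the next power of two -- the internal
   fragmentation described above: you waste at most a factor of two. */
static size_t round_pow2(size_t x) {
    size_t s = 1;
    while (s < x)
        s <<= 1;      /* double until the block covers the request */
    return s;
}
```

A 13-byte request becomes a 16-byte block and a 65-byte request becomes a 128-byte block, so the wasted space is always strictly less than the amount actually in use.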
506 00:26:19,910 --> 00:26:27,430 So here's the basic allocation scheme to allocate x bytes. 507 00:26:27,430 --> 00:26:32,660 So what I do is, if I'm allocating x bytes, what I do 508 00:26:32,660 --> 00:26:38,870 is, I look in the bin ceiling of log x, because that's where 509 00:26:38,870 --> 00:26:42,650 everything of size two to the ceiling of log x is, which is 510 00:26:42,650 --> 00:26:46,890 basically rounding x up to the next power of two. 511 00:26:46,890 --> 00:26:51,750 If it's non-empty, I just return a block as if it were 512 00:26:51,750 --> 00:26:54,030 an ordinary free list. 513 00:26:54,030 --> 00:26:56,350 And what I'll be doing is returning a block which is at 514 00:26:56,350 --> 00:26:59,380 most twice the size of the requested block. 515 00:26:59,380 --> 00:27:01,040 But that's generally OK. 516 00:27:01,040 --> 00:27:03,940 Somebody wants at least that many bytes. 517 00:27:03,940 --> 00:27:06,240 The fact that it's a few more bytes won't matter to them. 518 00:27:06,240 --> 00:27:08,800 They don't know that there are more bytes hidden there. 519 00:27:08,800 --> 00:27:11,720 They only take advantage of the fact that there are as many 520 00:27:11,720 --> 00:27:15,760 bytes as they've asked for. 521 00:27:15,760 --> 00:27:20,190 Otherwise, suppose that that bin is empty. 522 00:27:20,190 --> 00:27:22,930 There's nothing in that bin at this time. 523 00:27:22,930 --> 00:27:28,250 Well, then what you do is you start looking to find a block 524 00:27:28,250 --> 00:27:32,430 in the next larger non-empty bin, and you split it up. 525 00:27:32,430 --> 00:27:34,180 So for example here, suppose that the 526 00:27:34,180 --> 00:27:36,820 request is x equals 3. 527 00:27:36,820 --> 00:27:40,705 I have to look in bin 2, which is blocks of size 4. 528 00:27:40,705 --> 00:27:42,960 Uh oh, bin 2 is empty. 529 00:27:42,960 --> 00:27:46,001 So what I do is, I start going down, looking in the next bin. 
530 00:27:46,001 --> 00:27:48,190 Oops, that's empty too, that's empty-- 531 00:27:48,190 --> 00:27:51,960 and then finally, I get to one where there is a block in it. 532 00:27:51,960 --> 00:27:54,160 So what I do then is I dice this up. 533 00:27:54,160 --> 00:27:59,090 I cut it in half, and then cut that one in half, until I get 534 00:27:59,090 --> 00:28:02,490 a piece that's exactly the size I want to return. 535 00:28:02,490 --> 00:28:03,180 OK? 536 00:28:03,180 --> 00:28:05,960 And then I distribute all of these guys up to the other 537 00:28:05,960 --> 00:28:07,210 bins, like this. 538 00:28:07,210 --> 00:28:12,230 539 00:28:12,230 --> 00:28:15,570 So I return one of this size, and I populate these other 540 00:28:15,570 --> 00:28:22,490 guys with one block each of the appropriate size. 541 00:28:22,490 --> 00:28:24,910 And as you can see, they're basically growing in a 542 00:28:24,910 --> 00:28:27,630 geometric fashion. 543 00:28:27,630 --> 00:28:30,070 And then I distribute that bin. 544 00:28:30,070 --> 00:28:33,760 Now there's a caveat here. 545 00:28:33,760 --> 00:28:39,230 Suppose that no larger blocks exist in my storage structure. 546 00:28:39,230 --> 00:28:43,720 Well, then I have to ask the operating system for more 547 00:28:43,720 --> 00:28:50,860 bytes, to allocate x more bytes of virtual memory. 548 00:28:50,860 --> 00:28:53,345 Now in practice what you do is you don't actually ask for x, 549 00:28:53,345 --> 00:28:54,620 since you never ask the operating 550 00:28:54,620 --> 00:28:56,590 system for small values. 551 00:28:56,590 --> 00:28:59,300 You always ask the operating system for big values. 552 00:28:59,300 --> 00:29:01,170 Give me another page worth, or give me another 553 00:29:01,170 --> 00:29:03,110 five pages, or something. 554 00:29:03,110 --> 00:29:06,000 Because that's an expensive call to the operating system, 555 00:29:06,000 --> 00:29:09,130 to ask it to allocate more storage. 
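The allocate-and-split procedure just walked through can be sketched in C. This is a toy version of my own, not the allocator from the lecture's slides: all names are mine, there are no block headers, no coalescing, and no OS call when every bin is empty -- it only shows the bin indexing and the repeated halving:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* bin[k] is a singly linked free list of blocks of size 2^k bytes.
   A free block's first word is reused as the "next" pointer. */
#define NBINS 20

static void *bin[NBINS];
static uint64_t arena[128];                 /* 1024-byte demo arena */

static unsigned bin_of(size_t x) {          /* ceil(lg x), clamped so a  */
    if (x < sizeof(void *))                 /* free block can still hold */
        x = sizeof(void *);                 /* its next pointer          */
    unsigned k = 0;
    while (((size_t)1 << k) < x)
        k++;
    return k;
}

static void bin_push(unsigned k, void *p) {
    *(void **)p = bin[k];
    bin[k] = p;
}

/* Allocate x bytes: pop from bin ceil(lg x); if that bin is empty,
   take a block from the next larger non-empty bin and cut it in half
   repeatedly, parking the spare halves in the smaller bins. */
static void *bin_alloc(size_t x) {
    unsigned k = bin_of(x), j = k;
    while (j < NBINS && bin[j] == NULL)
        j++;
    if (j == NBINS)
        return NULL;                        /* here you'd ask the OS */
    void *p = bin[j];
    bin[j] = *(void **)p;
    while (j > k) {                         /* keep the bottom half,     */
        j--;                                /* free the top half to bin j */
        bin_push(j, (char *)p + ((size_t)1 << j));
    }
    return p;
}

static void bin_free(void *p, size_t x) {   /* back onto its size's bin */
    bin_push(bin_of(x), p);
}

/* Demo: seed one 1024-byte block, allocate 3 bytes, and check that the
   splitting populated the smaller bins as in the lecture's figure. */
static int demo(void) {
    char *base = (char *)arena;
    bin_push(10, base);                     /* 1024 = 2^10 */
    char *p = bin_alloc(3);                 /* served as an 8-byte block */
    int ok = p == base
        && bin[3] == base + 8               /* halves parked on the way  */
        && bin[9] == base + 512;            /* down: 512, 256, ..., 8    */
    bin_free(p, 3);
    return ok && bin[3] == base;
}
```

A real allocator would also record each block's size in a header so that freeing doesn't need the size passed back in.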
556 00:29:09,130 --> 00:29:10,570 Let me explain how that works. 557 00:29:10,570 --> 00:29:13,070 And this gets back to the question you had before, which 558 00:29:13,070 --> 00:29:16,460 is, where's the stuff coming from anyway? 559 00:29:16,460 --> 00:29:19,570 What the operating system is allocating is actually not 560 00:29:19,570 --> 00:29:24,620 physical pages, but rather parts of virtual memory. 561 00:29:24,620 --> 00:29:29,830 So we looked at this before, about how virtual memory is 562 00:29:29,830 --> 00:29:33,150 laid out for a typical program. 563 00:29:33,150 --> 00:29:35,790 So what we do is, on the very high addresses, we have the 564 00:29:35,790 --> 00:29:37,760 stack, and the stack grows downwards. 565 00:29:37,760 --> 00:29:39,480 This is the call stack. 566 00:29:39,480 --> 00:29:41,750 At the low address, we typically start out with 567 00:29:41,750 --> 00:29:42,950 what's called the text segment, 568 00:29:42,950 --> 00:29:44,200 which holds your code. 569 00:29:44,200 --> 00:29:46,320 570 00:29:46,320 --> 00:29:49,860 Then there's a region called the data segment, and this 571 00:29:49,860 --> 00:29:54,480 consists of data that needs to be initialized. 572 00:29:54,480 --> 00:29:58,320 So the compiler typically distinguishes between data 573 00:29:58,320 --> 00:30:01,160 that needs initialization and data which is going to be 574 00:30:01,160 --> 00:30:05,150 zero, because data that needs initialization, it writes into 575 00:30:05,150 --> 00:30:08,610 the binary executable file and stores that on disk. 576 00:30:08,610 --> 00:30:11,580 So when you load in your program, it just loads as part 577 00:30:11,580 --> 00:30:14,710 of the text here, it loads that data segment, saying what 578 00:30:14,710 --> 00:30:16,370 all of the storage is. 579 00:30:16,370 --> 00:30:19,390 The next segment is, for historical reasons, called the 580 00:30:19,390 --> 00:30:21,250 BSS segment. 
581 00:30:21,250 --> 00:30:26,780 And what BSS consists of is all the variables that at the 582 00:30:26,780 --> 00:30:30,480 start of the program should be initialized to 0. 583 00:30:30,480 --> 00:30:34,700 It doesn't bother storing these in the file. 584 00:30:34,700 --> 00:30:35,500 Why not? 585 00:30:35,500 --> 00:30:38,520 Because it's easy enough to just bring it in, know 586 00:30:38,520 --> 00:30:43,980 how big that region is, and then do a memset of this 587 00:30:43,980 --> 00:30:48,170 whole region with zero, to zero it out, rather than 588 00:30:48,170 --> 00:30:50,185 trying to read in a whole bunch of zeroes. 589 00:30:50,185 --> 00:30:52,800 590 00:30:52,800 --> 00:30:55,790 An obvious kind of data compression. 591 00:30:55,790 --> 00:30:59,740 Then heap storage is allocated in the space 592 00:30:59,740 --> 00:31:02,920 above the BSS segment. 593 00:31:02,920 --> 00:31:05,700 So the idea is, there's a part of the heap which already has 594 00:31:05,700 --> 00:31:07,360 allocated stuff in it. 595 00:31:07,360 --> 00:31:10,280 And whenever you need more heap storage, you ask the 596 00:31:10,280 --> 00:31:13,400 operating system-- the operating system moves, just as 597 00:31:13,400 --> 00:31:18,850 if it were a stack, the line for where the heap ends, to 598 00:31:18,850 --> 00:31:22,260 give you more pages of heap storage that you 599 00:31:22,260 --> 00:31:25,660 can now allocate out of. 600 00:31:25,660 --> 00:31:27,110 And what it's doing is allocating 601 00:31:27,110 --> 00:31:28,360 pieces of virtual memory. 602 00:31:28,360 --> 00:31:31,380 603 00:31:31,380 --> 00:31:37,630 Now, here's a good question to think about, which I think gets 604 00:31:37,630 --> 00:31:40,570 directly to your point. 605 00:31:40,570 --> 00:31:42,070 So think about this. 606 00:31:42,070 --> 00:31:45,370 We have a 64-bit address space. 
607 00:31:45,370 --> 00:31:49,480 If I try to write at the rate of 4 billion bytes per second, 608 00:31:49,480 --> 00:31:51,640 that's pretty fast, because that's more like the speed at which the 609 00:31:51,640 --> 00:31:52,590 registers get written. 610 00:31:52,590 --> 00:31:55,360 Not the speed at which memory gets written. 611 00:31:55,360 --> 00:32:00,060 So really, 4 billion bytes per second is generous. 612 00:32:00,060 --> 00:32:07,330 It takes over a century to write all of virtual memory. 613 00:32:07,330 --> 00:32:10,840 So as a practical matter, that means I never run 614 00:32:10,840 --> 00:32:12,600 out of stack space. 615 00:32:12,600 --> 00:32:16,370 I never run out of heap space. 616 00:32:16,370 --> 00:32:18,910 They're never going to cross, no matter how much I'm 617 00:32:18,910 --> 00:32:19,760 allocating. 618 00:32:19,760 --> 00:32:22,980 Even if I allocate on every cycle, there's not going to be 619 00:32:22,980 --> 00:32:24,780 a program-- unless you're going to set a program running 620 00:32:24,780 --> 00:32:28,670 for a millennium sort of thing, you're not 621 00:32:28,670 --> 00:32:29,550 going to see that. 622 00:32:29,550 --> 00:32:33,190 Most programs run for a few minutes. 623 00:32:33,190 --> 00:32:34,205 Some may run for days. 624 00:32:34,205 --> 00:32:35,980 Some may run for years. 625 00:32:35,980 --> 00:32:38,220 Very few people write programs to run for a century. 626 00:32:38,220 --> 00:32:41,370 627 00:32:41,370 --> 00:32:44,920 So why not just allocate out of free virtual memory and 628 00:32:44,920 --> 00:32:45,850 never bother freeing? 629 00:32:45,850 --> 00:32:48,810 Why am I worried about freeing stuff all the time? 630 00:32:48,810 --> 00:32:50,080 Why not just allocate it? 631 00:32:50,080 --> 00:32:50,860 It's virtual memory. 632 00:32:50,860 --> 00:32:52,150 It's virtual. 633 00:32:52,150 --> 00:32:55,290 I just allocate more storage as I need it. 634 00:32:55,290 --> 00:32:56,090 Yeah? 
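The over-a-century claim checks out numerically. A sketch in C, using the lecture's 4-billion-bytes-per-second rate (the function name is mine):

```c
#include <assert.h>

/* Writing all 2^64 bytes of a 64-bit address space at 4 billion
   bytes per second: how many years does that take? */
static double years_to_write_vm(void) {
    double bytes = 65536.0 * 65536.0 * 65536.0 * 65536.0; /* 2^64      */
    double seconds = bytes / 4e9;                         /* at 4 GB/s */
    return seconds / (365.25 * 24.0 * 3600.0);
}
```

This works out to roughly 146 years, comfortably over the century claimed.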
635 00:32:56,090 --> 00:32:57,870 AUDIENCE: [INAUDIBLE] memory? 636 00:32:57,870 --> 00:32:59,410 CHARLES LEISERSON: So it is tied to real memory, and 637 00:32:59,410 --> 00:33:01,070 that's part of the answer. 638 00:33:01,070 --> 00:33:03,800 It is tied to real memory, because anything that you 639 00:33:03,800 --> 00:33:09,030 write must find itself a place on disk. 640 00:33:09,030 --> 00:33:11,110 Because it's going to be written to pages, and those 641 00:33:11,110 --> 00:33:14,220 pages are going to be paged and then written to what's 642 00:33:14,220 --> 00:33:17,680 called the swap space on disk, which allows you 643 00:33:17,680 --> 00:33:18,810 to put things there. 644 00:33:18,810 --> 00:33:23,320 Even so, though, writing to disk-- 645 00:33:23,320 --> 00:33:26,450 it's still not going to take very long. 646 00:33:26,450 --> 00:33:29,000 I mean, you're never going to be able to use up disk, 647 00:33:29,000 --> 00:33:31,580 because disk takes so long to write to. 648 00:33:31,580 --> 00:33:33,109 But that is part of the answer. 649 00:33:33,109 --> 00:33:38,268 AUDIENCE: So you need to basically use [INAUDIBLE] 650 00:33:38,268 --> 00:33:39,206 software? 651 00:33:39,206 --> 00:33:41,040 And so-- 652 00:33:41,040 --> 00:33:42,520 CHARLES LEISERSON: And the page table. 653 00:33:42,520 --> 00:33:44,560 AUDIENCE: [UNINTELLIGIBLE] 654 00:33:44,560 --> 00:33:45,260 on the disk. 655 00:33:45,260 --> 00:33:47,480 CHARLES LEISERSON: So the issue is, this is where the 656 00:33:47,480 --> 00:33:50,950 external fragmentation would be horrendous. 
657 00:33:50,950 --> 00:33:53,850 If you just kept writing across memory, and you only 658 00:33:53,850 --> 00:33:56,450 had a couple of words written on every page that were 659 00:33:56,450 --> 00:34:01,950 actually used, the rest is just garbage, then what you're 660 00:34:01,950 --> 00:34:06,370 saying is that in order to access a given datum, I'm not 661 00:34:06,370 --> 00:34:09,659 going to have it in memory together with a lot of other useful stuff. 662 00:34:09,659 --> 00:34:11,380 What I'm going to have in memory with it is going to be 663 00:34:11,380 --> 00:34:15,500 all garbage, the stuff that I would have freed. 664 00:34:15,500 --> 00:34:18,100 And so therefore, you end up having your program 665 00:34:18,100 --> 00:34:22,870 distributed across a large region of virtual memory, and 666 00:34:22,870 --> 00:34:25,330 that means as we're doing the paging, et cetera, we're 667 00:34:25,330 --> 00:34:30,330 getting very poor page locality as 668 00:34:30,330 --> 00:34:31,909 I'm accessing things. 669 00:34:31,909 --> 00:34:35,409 And so I have a finite number of pages that I fit in memory, 670 00:34:35,409 --> 00:34:38,610 and now I may be blowing that out, and every access, rather 671 00:34:38,610 --> 00:34:40,520 than going to something nearby, is 672 00:34:40,520 --> 00:34:42,260 instead going to disk. 673 00:34:42,260 --> 00:34:46,770 And so that can cause disk thrashing, where every access 674 00:34:46,770 --> 00:34:48,780 causes a disk access. 675 00:34:48,780 --> 00:34:51,830 So how fast are disk accesses? 676 00:34:51,830 --> 00:34:53,080 Order of magnitude? 677 00:34:53,080 --> 00:34:56,896 678 00:34:56,896 --> 00:34:58,550 How fast are disk accesses? 679 00:34:58,550 --> 00:35:00,680 What's a disk access cost? 680 00:35:00,680 --> 00:35:01,930 Fast disk? 681 00:35:01,930 --> 00:35:04,656 682 00:35:04,656 --> 00:35:05,906 AUDIENCE: [INAUDIBLE] 683 00:35:05,906 --> 00:35:10,057 684 00:35:10,057 --> 00:35:14,054 CHARLES LEISERSON: No, it's slower than that. 
685 00:35:14,054 --> 00:35:15,497 Disk access. 686 00:35:15,497 --> 00:35:18,470 So good memory access, DRAM memory access, is like 60 687 00:35:18,470 --> 00:35:22,070 nanoseconds for a DRAM access. 688 00:35:22,070 --> 00:35:23,810 It's like 60 nanoseconds. 689 00:35:23,810 --> 00:35:28,460 So the processor is going somewhere around, these days, 690 00:35:28,460 --> 00:35:32,370 between 3 and 4 gigahertz. 691 00:35:32,370 --> 00:35:39,210 So memory is like 200 processor cycles away. 692 00:35:39,210 --> 00:35:41,950 How fast is disk? 693 00:35:41,950 --> 00:35:45,050 An ordinary disk, not a solid state device. 694 00:35:45,050 --> 00:35:46,300 Ordinary disk. 695 00:35:46,300 --> 00:35:48,720 696 00:35:48,720 --> 00:35:51,840 10 milliseconds is a good ballpark to give. 697 00:35:51,840 --> 00:35:53,040 That's a good number to know. 698 00:35:53,040 --> 00:35:57,590 10 milliseconds, not 10 microseconds, not 10 699 00:35:57,590 --> 00:35:58,780 nanoseconds. 700 00:35:58,780 --> 00:36:01,510 10 milliseconds. 701 00:36:01,510 --> 00:36:05,510 So you are, what? 702 00:36:05,510 --> 00:36:09,130 6 to 7 orders of magnitude slower if you go to disk. 703 00:36:09,130 --> 00:36:11,930 704 00:36:11,930 --> 00:36:13,840 Really slow. 705 00:36:13,840 --> 00:36:16,280 So one of the things in performance optimization-- 706 00:36:16,280 --> 00:36:18,630 we're focusing a lot on the CPU-- 707 00:36:18,630 --> 00:36:21,070 as is often the case for database things, 708 00:36:21,070 --> 00:36:23,530 is, how do you minimize those disk accesses? 709 00:36:23,530 --> 00:36:26,640 And all the things that we're learning about, performance 710 00:36:26,640 --> 00:36:34,040 for CPUs, can be applied to disk accesses, as well. 
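Converting those ballpark latencies into processor cycles makes the gap vivid. A sketch in C; the 3.5 GHz clock is my own pick from within the stated 3-to-4-gigahertz range:

```c
#include <assert.h>

/* Latency expressed in processor cycles at a 3.5 GHz clock. */
static double cycles_at_3_5ghz(double seconds) {
    return seconds * 3.5e9;
}
```

DRAM at 60 nanoseconds is about 210 cycles, matching the "200 processor cycles away" figure, while a 10-millisecond disk access is about 35 million cycles.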
711 00:36:34,040 --> 00:36:37,280 So the idea here is we want to avoid-- 712 00:36:37,280 --> 00:36:40,280 that's why we don't want to have our stuff strewn across a 713 00:36:40,280 --> 00:36:43,670 whole bunch of pages where we only have a fixed amount of 714 00:36:43,670 --> 00:36:44,980 internal memory, because then we'll end up 715 00:36:44,980 --> 00:36:46,640 paging all the time. 716 00:36:46,640 --> 00:36:50,500 Every access will cause us to do a disk access, because it 717 00:36:50,500 --> 00:36:51,820 won't be very likely that we're getting 718 00:36:51,820 --> 00:36:53,400 stuff on the same page. 719 00:36:53,400 --> 00:36:55,630 So getting back, for example, to the fixed allocation, 720 00:36:55,630 --> 00:37:00,040 that's why it's a good idea to try to keep those pages full, 721 00:37:00,040 --> 00:37:01,790 so that we have a small footprint. 722 00:37:01,790 --> 00:37:06,480 So the goal of most storage allocators is therefore to use 723 00:37:06,480 --> 00:37:11,710 as little virtual memory as possible, and try to keep the 724 00:37:11,710 --> 00:37:14,130 used portions very compact. 725 00:37:14,130 --> 00:37:18,030 Because the more compact it is, the more likely it is that 726 00:37:18,030 --> 00:37:21,180 when you access one, you'll be able to access something 727 00:37:21,180 --> 00:37:25,360 nearby without causing a page fault. 728 00:37:25,360 --> 00:37:27,940 So that's the goal of almost all storage allocators. 729 00:37:27,940 --> 00:37:31,580 Try to keep things compact, try to keep things tight and 730 00:37:31,580 --> 00:37:35,050 in very little virtual memory space, even though virtual 731 00:37:35,050 --> 00:37:36,640 memory is huge. 732 00:37:36,640 --> 00:37:38,790 Now, that doesn't mean that everything has to be 733 00:37:38,790 --> 00:37:39,430 contiguous. 
734 00:37:39,430 --> 00:37:42,650 For example, in this organization as I showed 735 00:37:42,650 --> 00:37:44,730 before, actually, I have it just on the previous slide, so 736 00:37:44,730 --> 00:37:47,360 why don't I go back there-- 737 00:37:47,360 --> 00:37:50,660 these are very far away in memory, but there's only a 738 00:37:50,660 --> 00:37:55,970 couple of them, and they're tight and local in the regions 739 00:37:55,970 --> 00:37:57,650 that they're operating in. 740 00:37:57,650 --> 00:38:00,160 What I don't want to do is have it where I have just 741 00:38:00,160 --> 00:38:01,560 little stuff. 742 00:38:01,560 --> 00:38:05,460 It's OK to go into the middle of virtual memory and have a 743 00:38:05,460 --> 00:38:07,760 whole bunch of consecutive pages full of data. 744 00:38:07,760 --> 00:38:09,360 That's fine. 745 00:38:09,360 --> 00:38:12,060 What's bad is having just a little bit of data, a little 746 00:38:12,060 --> 00:38:14,120 bit of data, and a little bit of data sort of 747 00:38:14,120 --> 00:38:15,290 sprinkled all over. 748 00:38:15,290 --> 00:38:16,680 That's bad. 749 00:38:16,680 --> 00:38:19,930 But as long as I can put a whole bunch of data nearby, it 750 00:38:19,930 --> 00:38:22,950 doesn't matter where in virtual memory I put it, 751 00:38:22,950 --> 00:38:24,200 particularly. 752 00:38:24,200 --> 00:38:26,100 753 00:38:26,100 --> 00:38:29,090 So the idea is, let's try to use as little virtual memory 754 00:38:29,090 --> 00:38:29,980 as possible. 755 00:38:29,980 --> 00:38:31,230 Any questions about that? 756 00:38:31,230 --> 00:38:36,180 757 00:38:36,180 --> 00:38:37,227 OK. 758 00:38:37,227 --> 00:38:38,660 So how do we do that? 759 00:38:38,660 --> 00:38:43,270 So the first thing is, binned free list turns out to do a 760 00:38:43,270 --> 00:38:49,170 really good job of using relatively little memory. 
761 00:38:49,170 --> 00:38:55,930 So suppose that the user program is using, at most, m 762 00:38:55,930 --> 00:38:58,020 memory at any point during the execution. 763 00:38:58,020 --> 00:38:59,770 Maybe allocating, maybe freeing. 764 00:38:59,770 --> 00:39:02,330 But if I took the high watermark of everything he 765 00:39:02,330 --> 00:39:06,810 says that he needs, we call that m. 766 00:39:06,810 --> 00:39:11,440 Then it turns out that if you use a binned free list allocator, 767 00:39:11,440 --> 00:39:14,330 the amount of virtual memory consumed is at 768 00:39:14,330 --> 00:39:17,480 most m times log m. 769 00:39:17,480 --> 00:39:20,170 Order m times log m. 770 00:39:20,170 --> 00:39:21,160 And why is that? 771 00:39:21,160 --> 00:39:24,000 Here's kind of a hand-wavy proof sketch. 772 00:39:24,000 --> 00:39:28,370 So whenever I make a request for a block of size x, it may 773 00:39:28,370 --> 00:39:32,970 consume at most twice x storage 774 00:39:32,970 --> 00:39:34,710 that's actually in use. 775 00:39:34,710 --> 00:39:38,840 And so it turns out that the total amount of memory devoted 776 00:39:38,840 --> 00:39:43,540 to blocks of size 2 to the k is therefore at most 2m. 777 00:39:43,540 --> 00:39:48,430 So for any given size, I might, at any given point in 778 00:39:48,430 --> 00:39:53,140 time, have the program using all blocks of that given size. 779 00:39:53,140 --> 00:39:57,200 Let's say size 2 to the k. 780 00:39:57,200 --> 00:40:01,350 And since there are a logarithmic number of those 781 00:40:01,350 --> 00:40:09,690 bins, each bin of which could have a size at most m, the 782 00:40:09,690 --> 00:40:11,320 total is order m log m. 783 00:40:11,320 --> 00:40:13,980 784 00:40:13,980 --> 00:40:16,990 Now it turns out that there's actually a stronger property. 785 00:40:16,990 --> 00:40:21,550 In fact, you can show that binned free list is order one 786 00:40:21,550 --> 00:40:22,230 competitive. 
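Written out, the hand-wavy sketch is just this counting argument, using m for the program's high-water mark:

```latex
% A request for x bytes is served with a block of size
% 2^{\lceil \lg x \rceil} < 2x, so the blocks of any one size 2^k
% that are in use at once total at most 2m bytes:
\[
  \mathrm{VM}_k \;\le\; 2m
  \qquad \text{for each bin } k = 0, 1, \ldots, \lceil \lg m \rceil .
\]
% Summing over the logarithmically many bins gives the bound:
\[
  \mathrm{VM} \;\le\; \sum_{k=0}^{\lceil \lg m \rceil} 2m \;=\; O(m \lg m).
\]
```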
787 00:40:22,230 --> 00:40:24,620 Who knows the term "competitive" in this sense? 788 00:40:24,620 --> 00:40:27,530 The algorithmic meaning of "competitive"? 789 00:40:27,530 --> 00:40:32,180 So what it means is, it's at most a constant factor slower 790 00:40:32,180 --> 00:40:35,850 than the optimal omniscient algorithm that could figure 791 00:40:35,850 --> 00:40:39,420 out exactly what's going on in every step. 792 00:40:39,420 --> 00:40:42,280 So you're within a constant factor competitive with the 793 00:40:42,280 --> 00:40:45,970 optimal allocator, assuming that the optimal allocator 794 00:40:45,970 --> 00:40:49,470 does not do what's called coalescing. 795 00:40:49,470 --> 00:40:52,170 If it simply is dividing up things, then it can subdivide, 796 00:40:52,170 --> 00:40:55,390 but it can never glue things back together again. 797 00:40:55,390 --> 00:40:58,150 It turns out that in fact, binned free list is really 798 00:40:58,150 --> 00:41:01,430 much better than that. 799 00:41:01,430 --> 00:41:04,830 So let's talk a bit about coalescing, because many, many 800 00:41:04,830 --> 00:41:06,375 memory allocators use coalescing. 801 00:41:06,375 --> 00:41:09,220 802 00:41:09,220 --> 00:41:14,250 So the idea is, I'm concerned about the case where I end up 803 00:41:14,250 --> 00:41:18,380 with a whole bunch of little small objects, and yet the 804 00:41:18,380 --> 00:41:21,110 allocations now come from big chunks. 805 00:41:21,110 --> 00:41:25,340 And so I have to go and use more of my virtual memory, or 806 00:41:25,340 --> 00:41:27,970 maybe I can glue together those little pieces into a big 807 00:41:27,970 --> 00:41:30,100 chunk, and not have to go off to memory. 808 00:41:30,100 --> 00:41:32,670 So that's what coalescing is all about. 809 00:41:32,670 --> 00:41:35,610 So the idea is, if two things are adjacent, let 810 00:41:35,610 --> 00:41:37,490 me glue them together. 
811 00:41:37,490 --> 00:41:40,060 And there are really clever systems for doing that, 812 00:41:40,060 --> 00:41:44,450 including a very famous one called the buddy system. 813 00:41:44,450 --> 00:41:47,820 But using any of these mechanisms for coalescing, you 814 00:41:47,820 --> 00:41:50,730 always have to trade off against the simplicity of not 815 00:41:50,730 --> 00:41:52,690 coalescing. 816 00:41:52,690 --> 00:41:55,710 Not coalescing is: stick it in the free list, I'm done. 817 00:41:55,710 --> 00:41:57,950 Coalescing is, stick it in the free list, oh, am I next to 818 00:41:57,950 --> 00:42:00,130 somebody I can merge with? 819 00:42:00,130 --> 00:42:01,400 Now let me merge. 820 00:42:01,400 --> 00:42:05,460 So you always have to weigh, is that really going to be an 821 00:42:05,460 --> 00:42:07,290 advantageous thing? 822 00:42:07,290 --> 00:42:10,730 It turns out there are no good theoretical bounds that I'm 823 00:42:10,730 --> 00:42:15,060 aware of that prove the effectiveness of coalescing. 824 00:42:15,060 --> 00:42:17,740 People have not figured out how to analyze this, or what 825 00:42:17,740 --> 00:42:20,620 an appropriate model is for understanding whether 826 00:42:20,620 --> 00:42:22,940 coalescing is a good idea. 827 00:42:22,940 --> 00:42:26,090 However, it does seem to work in practice for some things, 828 00:42:26,090 --> 00:42:29,510 and I think the reason for that-- and this is just my own 829 00:42:29,510 --> 00:42:30,680 take on it-- 830 00:42:30,680 --> 00:42:34,090 is that storage tends to be deallocated as a stack. 831 00:42:34,090 --> 00:42:37,370 So you tend to go into a region of a program where you 832 00:42:37,370 --> 00:42:39,200 do a lot of heap allocation-- 833 00:42:39,200 --> 00:42:41,690 let's say I'm building up a graph, and I keep doing a 834 00:42:41,690 --> 00:42:42,570 whole bunch of mallocs. 835 00:42:42,570 --> 00:42:44,840 And when I'm done with the graph, what do I do? 836 00:42:44,840 --> 00:42:47,320 I free it all. 
837 00:42:47,320 --> 00:42:52,250 So I tend to have these batch allocations and then 838 00:42:52,250 --> 00:42:58,130 deallocations, and they tend to work often as a stack, 839 00:42:58,130 --> 00:42:59,120 often in a batch. 840 00:42:59,120 --> 00:43:02,280 And that means that when I do that freeing, I can sometimes 841 00:43:02,280 --> 00:43:05,550 take advantage of that and do coalescing. 842 00:43:05,550 --> 00:43:14,130 In some systems, people have what are called memory pools, 843 00:43:14,130 --> 00:43:17,440 where you have a region where when you allocate out of it, 844 00:43:17,440 --> 00:43:20,640 you know you're allocating out of your own particular region 845 00:43:20,640 --> 00:43:24,220 specifically so that you can do coalescing of things in 846 00:43:24,220 --> 00:43:26,100 that region. 847 00:43:26,100 --> 00:43:28,760 And so you may have multiple allocators, and you'll call 848 00:43:28,760 --> 00:43:31,930 the allocator on the region, say, for a graph, because 849 00:43:31,930 --> 00:43:34,910 let's imagine you're creating two graphs, and one of them 850 00:43:34,910 --> 00:43:36,140 you're going to keep around, and the other 851 00:43:36,140 --> 00:43:37,060 you're going to dump. 852 00:43:37,060 --> 00:43:39,340 Well, if I create them at the same time, the storage gets 853 00:43:39,340 --> 00:43:43,120 all intermixed. Then when I deallocate one graph, 854 00:43:43,120 --> 00:43:46,020 for the other one I'll have fragmentation all over. 855 00:43:46,020 --> 00:43:48,060 These little spots appear where the other 856 00:43:48,060 --> 00:43:49,380 graph used to be-- 857 00:43:49,380 --> 00:43:50,380 not so good. 
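A minimal sketch of such a memory pool in C. All names and the bump-pointer design are my own; real pool libraries are more elaborate, but this shows the point: each graph gets its own region, and freeing the whole graph is one operation that leaves contiguous storage behind:

```c
#include <assert.h>
#include <stddef.h>

/* A region ("memory pool"): allocate by bumping a pointer inside one
   dedicated buffer; free everything in the region at once. */
typedef struct {
    char  *base;
    size_t used, size;
} pool_t;

static void pool_init(pool_t *p, void *buf, size_t size) {
    p->base = buf;
    p->used = 0;
    p->size = size;
}

/* Bump-pointer allocation, with sizes rounded to 8-byte multiples. */
static void *pool_alloc(pool_t *p, size_t x) {
    x = (x + 7) & ~(size_t)7;
    if (p->used + x > p->size)
        return NULL;
    void *r = p->base + p->used;
    p->used += x;
    return r;
}

static void pool_free_all(pool_t *p) {      /* dump the whole region */
    p->used = 0;
}

static char buf_a[256], buf_b[256];         /* one region per "graph" */

/* Demo: two graphs in two pools; dumping graph B leaves graph A's
   storage contiguous, with no little holes sprinkled through it. */
static int demo(void) {
    pool_t a, b;
    pool_init(&a, buf_a, sizeof buf_a);
    pool_init(&b, buf_b, sizeof buf_b);
    void *n1 = pool_alloc(&a, 24);          /* nodes of graph A */
    void *n2 = pool_alloc(&a, 24);
    void *m1 = pool_alloc(&b, 24);          /* nodes of graph B */
    int ok = n1 == buf_a && n2 == buf_a + 24 && m1 == buf_b;
    pool_free_all(&b);                      /* dump graph B in one shot */
    return ok && b.used == 0 && a.used == 48;
}
```

The design choice here is exactly the batch-deallocation pattern described above: when lifetimes group naturally, a pool turns many frees into one and sidesteps fragmentation entirely.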
858 00:43:50,380 --> 00:43:53,990 But if I say, use this allocator for this graph, use 859 00:43:53,990 --> 00:43:59,310 that allocator for that graph, then I'm 860 00:43:59,310 --> 00:44:01,790 allocating those out of two different regions of memory, 861 00:44:01,790 --> 00:44:06,000 so that when I'm done with the first graph and deallocate 862 00:44:06,000 --> 00:44:08,260 everything, now I've created a nice big piece 863 00:44:08,260 --> 00:44:10,650 of contiguous storage. 864 00:44:10,650 --> 00:44:16,610 So people will sometimes play with different memory pools 865 00:44:16,610 --> 00:44:23,550 for allocators, and C++ has a fairly rich set of libraries 866 00:44:23,550 --> 00:44:26,220 and so forth for that, to allow you to do 867 00:44:26,220 --> 00:44:27,470 that kind of thing. 868 00:44:27,470 --> 00:44:30,170 869 00:44:30,170 --> 00:44:35,800 So let's go on and talk about garbage collection, because 870 00:44:35,800 --> 00:44:38,040 that's something you've probably seen in Java and 871 00:44:38,040 --> 00:44:40,940 Python, and what's going on under the covers there. 872 00:44:40,940 --> 00:44:45,210 Because even though C and C++ do not have garbage 873 00:44:45,210 --> 00:44:46,730 collection, you can actually implement 874 00:44:46,730 --> 00:44:47,860 your own garbage collector. 875 00:44:47,860 --> 00:44:52,620 It's not that hard, except for some details. 876 00:44:52,620 --> 00:44:55,260 877 00:44:55,260 --> 00:44:59,840 It's only hard when you want it to perform well. 878 00:44:59,840 --> 00:45:02,180 So the idea is to free the programmer 879 00:45:02,180 --> 00:45:04,310 from freeing objects. 880 00:45:04,310 --> 00:45:06,490 The downside, by the way, is when you free the programmer 881 00:45:06,490 --> 00:45:08,200 from freeing objects, they tend to 882 00:45:08,200 --> 00:45:10,570 create lots of garbage.
883 00:45:10,570 --> 00:45:13,460 And so what happens in Java programs, for example, is 884 00:45:13,460 --> 00:45:15,120 you'll have these-- 885 00:45:15,120 --> 00:45:19,460 I remember once having a Java program where, when it was 886 00:45:19,460 --> 00:45:23,640 sitting, waiting for user input, it would periodically 887 00:45:23,640 --> 00:45:25,750 go, garbage collect. 888 00:45:25,750 --> 00:45:27,576 It was just sitting there, and then it 889 00:45:27,576 --> 00:45:29,050 would go, garbage collect. 890 00:45:29,050 --> 00:45:30,010 We weren't doing anything. 891 00:45:30,010 --> 00:45:33,460 It was just periodically, the way they had written the code, 892 00:45:33,460 --> 00:45:36,940 they were somehow generating garbage, doing allocations and 893 00:45:36,940 --> 00:45:40,540 frees, when you were sitting around doing nothing. 894 00:45:40,540 --> 00:45:43,510 And so it would basically use up all of the space and then 895 00:45:43,510 --> 00:45:45,530 do a garbage collection, use up all the space and do a 896 00:45:45,530 --> 00:45:46,260 garbage collection. 897 00:45:46,260 --> 00:45:48,840 Naturally that's kind of a slow system. 898 00:45:48,840 --> 00:45:51,520 And so even in something like Java, where you have a garbage 899 00:45:51,520 --> 00:45:53,770 collector, you do have to worry about whether you're 900 00:45:53,770 --> 00:45:57,200 creating garbage or not, and you can generate faster code 901 00:45:57,200 --> 00:46:00,820 by not creating as much garbage, or by finding ways of 902 00:46:00,820 --> 00:46:03,120 recycling the garbage yourself. 903 00:46:03,120 --> 00:46:06,580 So even in Java, it's worthwhile saying, give me a 904 00:46:06,580 --> 00:46:08,890 big chunk of memory, let me manage it myself, 905 00:46:08,890 --> 00:46:09,960 now let me free it. 906 00:46:09,960 --> 00:46:12,220 I don't have to free it, but when I'm done with it, I'll be 907 00:46:12,220 --> 00:46:12,670 done with it. 
908 00:46:12,670 --> 00:46:15,170 But in the meantime, let me manage it myself so that I 909 00:46:15,170 --> 00:46:20,790 don't have it continually generating more garbage. 910 00:46:20,790 --> 00:46:23,790 So the garbage collector can be built in, or 911 00:46:23,790 --> 00:46:25,350 you can do it yourself. 912 00:46:25,350 --> 00:46:29,020 And mostly in this class, we do stuff ourselves, so we know 913 00:46:29,020 --> 00:46:30,290 how they work. 914 00:46:30,290 --> 00:46:32,840 So that even when you're doing it with the built-in one, you know 915 00:46:32,840 --> 00:46:36,470 what's going on under the covers. 916 00:46:36,470 --> 00:46:38,450 So here's some terminology. 917 00:46:38,450 --> 00:46:41,660 The roots in garbage collection are the objects 918 00:46:41,660 --> 00:46:43,930 that are directly accessible by the program. 919 00:46:43,930 --> 00:46:46,770 And those typically are things like globals, stack 920 00:46:46,770 --> 00:46:48,570 variables, and so forth. 921 00:46:48,570 --> 00:46:53,830 Things that the program can directly address by name. 922 00:46:53,830 --> 00:46:57,100 The live objects are reachable from the roots by following 923 00:46:57,100 --> 00:47:00,890 pointers, and the dead objects are the ones that the 924 00:47:00,890 --> 00:47:04,790 programmer can never get to again. 925 00:47:04,790 --> 00:47:08,180 And so what we're interested in doing in garbage collection 926 00:47:08,180 --> 00:47:12,610 is finding all the dead objects and recycling them so 927 00:47:12,610 --> 00:47:15,060 we can use them again, so that they don't end up taking up 928 00:47:15,060 --> 00:47:19,590 space in our virtual memory and slowing us down by making 929 00:47:19,590 --> 00:47:22,140 our virtual memory space continue to grow and grow and 930 00:47:22,140 --> 00:47:27,050 grow across many disk pages.
931 00:47:27,050 --> 00:47:33,220 So one of the questions, in order for one to be able to do 932 00:47:33,220 --> 00:47:36,560 garbage collection, is how do I know where 933 00:47:36,560 --> 00:47:39,080 pointers are in objects? 934 00:47:39,080 --> 00:47:41,200 How do I know what's accessible? 935 00:47:41,200 --> 00:47:44,590 And so typically, in order to do that, you have to have a 936 00:47:44,590 --> 00:47:48,730 very strong idea of where is a pointer and where is data? 937 00:47:48,730 --> 00:47:50,630 Because you don't want to follow data as if it's a 938 00:47:50,630 --> 00:47:54,220 pointer, you want to know which regions of a given block 939 00:47:54,220 --> 00:47:56,090 of memory are pointers. 940 00:47:56,090 --> 00:47:59,370 And so strong typing helps a lot with that, which is why 941 00:47:59,370 --> 00:48:02,540 mostly people do garbage collection in strongly typed 942 00:48:02,540 --> 00:48:05,590 languages, because then you can actually identify it. 943 00:48:05,590 --> 00:48:10,040 It also requires you to prohibit pointer arithmetic. 944 00:48:10,040 --> 00:48:12,280 So the ability to take a pointer and then index off 945 00:48:12,280 --> 00:48:16,000 that pointer means I can get to some piece of storage that 946 00:48:16,000 --> 00:48:18,140 I can't access directly. 947 00:48:18,140 --> 00:48:20,640 And so typically in these kinds of systems, pointer 948 00:48:20,640 --> 00:48:23,940 arithmetic is illegal. 949 00:48:23,940 --> 00:48:28,000 So you can do array indexing, as long as you treat the array 950 00:48:28,000 --> 00:48:29,310 as a single object. 951 00:48:29,310 --> 00:48:33,450 You can't treat an array as the component pieces. 952 00:48:33,450 --> 00:48:36,070 You have to treat it as a single object. 953 00:48:36,070 --> 00:48:37,310 So pointer arithmetic. 954 00:48:37,310 --> 00:48:39,910 Well, you folks, who's familiar with pointer 955 00:48:39,910 --> 00:48:41,420 arithmetic? 
956 00:48:41,420 --> 00:48:43,320 At this point, almost all of you have done stuff with 957 00:48:43,320 --> 00:48:47,770 pointer arithmetic, if your code is any good. 958 00:48:47,770 --> 00:48:50,940 Almost all of you have done adding to pointers in order to 959 00:48:50,940 --> 00:48:54,020 move through an array or what have you as being somewhat 960 00:48:54,020 --> 00:48:57,370 cheaper than doing an array index every time. 961 00:48:57,370 --> 00:49:02,460 It saves you an extra memory reference, typically. 962 00:49:02,460 --> 00:49:05,300 So that's one of their restrictions when you want to 963 00:49:05,300 --> 00:49:08,530 do garbage collection in a general setting. 964 00:49:08,530 --> 00:49:11,380 Now probably the simplest and most useful one as a 965 00:49:11,380 --> 00:49:15,150 programmer to use for garbage collection is what's called 966 00:49:15,150 --> 00:49:16,680 reference counting. 967 00:49:16,680 --> 00:49:19,490 I know many, many programs where what you do is you do 968 00:49:19,490 --> 00:49:23,130 reference counting because you can. 969 00:49:23,130 --> 00:49:25,660 But as you'll see, it has a limitation. 970 00:49:25,660 --> 00:49:28,950 So the idea in reference counting is to keep a count of 971 00:49:28,950 --> 00:49:31,890 the number of pointers referencing each object. 972 00:49:31,890 --> 00:49:33,780 So here we have a whole bunch of different objects of 973 00:49:33,780 --> 00:49:36,630 different sizes, and whenever there's a pointer going to the 974 00:49:36,630 --> 00:49:39,353 object, I add one to the reference count shown in green 975 00:49:39,353 --> 00:49:41,070 at the top here. 976 00:49:41,070 --> 00:49:44,760 I hope I got the count right. 977 00:49:44,760 --> 00:49:50,820 Whenever the count drops to zero, you free the object. 978 00:49:50,820 --> 00:49:52,600 That means nobody's pointing to this object 979 00:49:52,600 --> 00:49:52,930 [UNINTELLIGIBLE]. 980 00:49:52,930 --> 00:49:53,930 Let's free it. 
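As a concrete illustration of the scheme just described, here is what reference counting can look like in C. This is only a hypothetical sketch (the obj_t type, the retain/release names, and the live_objects counter are invented for illustration, not the lecture's code): the count lives in each object, and when it drops to zero, the object is freed and the counts of the objects it points to are decremented in turn.

```c
#include <stdlib.h>

#define MAX_KIDS 4

int live_objects = 0;            /* for checking: how many objects exist */

/* Each object keeps a count of the pointers referencing it. */
typedef struct obj {
    int ref;                     /* number of incoming pointers */
    struct obj *kid[MAX_KIDS];   /* outgoing pointers (NULL if unused) */
} obj_t;

obj_t *obj_new(void) {           /* new object with one reference: the caller's */
    obj_t *o = calloc(1, sizeof(obj_t));
    o->ref = 1;
    live_objects++;
    return o;
}

void retain(obj_t *o) { if (o) o->ref++; }

/* Decrement the count; on zero, free the object and release
 * everything it points to, cascading as far as necessary. */
void release(obj_t *o) {
    if (o && --o->ref == 0) {
        for (int i = 0; i < MAX_KIDS; i++)
            release(o->kid[i]);
        live_objects--;
        free(o);
    }
}

/* Overwrite the pointer parent->kid[i], maintaining the counts. */
void set_kid(obj_t *parent, int i, obj_t *child) {
    retain(child);               /* increment first, in case child == old kid */
    release(parent->kid[i]);
    parent->kid[i] = child;
}
```

Moving a pointer with set_kid does exactly what the slide describes: bump the new target's count, drop the old target's count, and reclaim anything that reaches zero.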
981 00:49:53,930 --> 00:49:56,470 So let's see how that goes. 982 00:49:56,470 --> 00:49:58,840 So suppose that, for example, that pointer got 983 00:49:58,840 --> 00:50:00,920 moved to over here. 984 00:50:00,920 --> 00:50:02,750 Went away from there, got moved to there. 985 00:50:02,750 --> 00:50:05,800 Well, now there are two fields that have to be updated. 986 00:50:05,800 --> 00:50:08,140 The one that we took the pointer away from and the one 987 00:50:08,140 --> 00:50:09,700 that we put it to. 988 00:50:09,700 --> 00:50:11,850 So we update those. 989 00:50:11,850 --> 00:50:13,410 Oops, that guy went to zero. 990 00:50:13,410 --> 00:50:16,210 That means this stuff is garbage. 991 00:50:16,210 --> 00:50:18,470 But I can't just immediately go and collect 992 00:50:18,470 --> 00:50:19,490 that piece of storage. 993 00:50:19,490 --> 00:50:20,740 Why not? 994 00:50:20,740 --> 00:50:22,956 995 00:50:22,956 --> 00:50:25,932 AUDIENCE: To save the value of the pointer, and [INAUDIBLE] 996 00:50:25,932 --> 00:50:26,924 pointer later? 997 00:50:26,924 --> 00:50:27,970 CHARLES LEISERSON: Yeah, to save the 998 00:50:27,970 --> 00:50:29,230 value of the pointer-- 999 00:50:29,230 --> 00:50:31,015 AUDIENCE: To save the value of the pointer and then use it 1000 00:50:31,015 --> 00:50:32,152 again later. 1001 00:50:32,152 --> 00:50:35,074 The pointer isn't there, but you know where-- 1002 00:50:35,074 --> 00:50:37,022 CHARLES LEISERSON: Not quite, not quite. 1003 00:50:37,022 --> 00:50:37,509 Yes? 1004 00:50:37,509 --> 00:50:38,970 AUDIENCE: [INAUDIBLE] 1005 00:50:38,970 --> 00:50:41,405 the reference counts for [INAUDIBLE], and also-- 1006 00:50:41,405 --> 00:50:41,892 CHARLES LEISERSON: Yeah. 1007 00:50:41,892 --> 00:50:45,490 If you're going to deallocate this guy, he's got pointers. 1008 00:50:45,490 --> 00:50:48,440 They have to follow his pointers and see which ones 1009 00:50:48,440 --> 00:50:50,690 need to be deallocated. 1010 00:50:50,690 --> 00:50:52,080 Good. 
1011 00:50:52,080 --> 00:50:54,610 So this guy now, we realize, is garbage. 1012 00:50:54,610 --> 00:50:59,000 But before we can just free him, I've got to deallocate 1013 00:50:59,000 --> 00:51:01,410 these pointers. 1014 00:51:01,410 --> 00:51:03,510 So this guy also got decremented, but he didn't go to zero. 1015 00:51:03,510 --> 00:51:08,150 So this guy, we have to decrement these two fields, 1016 00:51:08,150 --> 00:51:10,150 because we took away those pointers because we're going 1017 00:51:10,150 --> 00:51:12,050 to garbage collect that. 1018 00:51:12,050 --> 00:51:14,380 And when we do that, it turns out this thing turns to 1019 00:51:14,380 --> 00:51:17,310 garbage, as well. 1020 00:51:17,310 --> 00:51:19,960 So when you do reference counting, you have to make 1021 00:51:19,960 --> 00:51:23,672 sure that when you decrement things and decide you're going 1022 00:51:23,672 --> 00:51:26,740 to free something, you also then go through and free the 1023 00:51:26,740 --> 00:51:29,380 things, and you continue that process until everything that 1024 00:51:29,380 --> 00:51:35,880 can be deallocated has been deallocated. 1025 00:51:35,880 --> 00:51:39,340 And then this becomes recyclable, put back into the 1026 00:51:39,340 --> 00:51:43,310 storage, into your free list or whatever scheme you have. 1027 00:51:43,310 --> 00:51:48,130 So this is a very simple way, when you're implementing data 1028 00:51:48,130 --> 00:51:50,070 structures, of building your own dynamic storage. 1029 00:51:50,070 --> 00:51:52,930 Because you basically can allocate stuff, you can free 1030 00:51:52,930 --> 00:51:58,230 it, and whenever the reference count goes to 0, boom. 1031 00:51:58,230 --> 00:52:01,890 You just have it deallocated. 1032 00:52:01,890 --> 00:52:02,480 OK? 1033 00:52:02,480 --> 00:52:05,590 But you can program that yourself [INAUDIBLE]. 1034 00:52:05,590 --> 00:52:07,170 Now there's a problem with the scheme, though. 1035 00:52:07,170 --> 00:52:09,710 What's the problem?
1036 00:52:09,710 --> 00:52:11,170 Cycles. 1037 00:52:11,170 --> 00:52:14,090 Yeah, cycles are the problem for reference counting. 1038 00:52:14,090 --> 00:52:16,040 So here's an example. 1039 00:52:16,040 --> 00:52:18,890 The problem is that a cycle is never garbage collected. 1040 00:52:18,890 --> 00:52:22,030 Let's take a look at this structure here. 1041 00:52:22,030 --> 00:52:25,640 So everybody's got a pointer from somewhere, but it turns 1042 00:52:25,640 --> 00:52:33,950 out that there's a cycle here where, notice that they are 1043 00:52:33,950 --> 00:52:36,040 pointing to each other, we even have a guy coming off 1044 00:52:36,040 --> 00:52:39,870 here, but nothing points into the cycle. 1045 00:52:39,870 --> 00:52:42,760 So there's no way that, from the roots, I can ever access 1046 00:52:42,760 --> 00:52:44,980 these four guys. 1047 00:52:44,980 --> 00:52:47,790 They're garbage, but my reference counting scheme will 1048 00:52:47,790 --> 00:52:55,220 never find it, because no one's ever going to decrement 1049 00:52:55,220 --> 00:52:57,810 one of those counters here. 1050 00:52:57,810 --> 00:53:01,060 So those are all garbage, but how would I know? 1051 00:53:01,060 --> 00:53:03,550 And that's basically the disadvantage 1052 00:53:03,550 --> 00:53:08,420 of reference counting. 1053 00:53:08,420 --> 00:53:13,860 And uncollected garbage, as we all know, stinks. 1054 00:53:13,860 --> 00:53:17,630 So we don't want that situation to occur. 1055 00:53:17,630 --> 00:53:21,860 So nevertheless, reference counting is great if you've 1056 00:53:21,860 --> 00:53:24,650 got an acyclic structure. 1057 00:53:24,650 --> 00:53:28,070 And the number of times acyclic structures come up 1058 00:53:28,070 --> 00:53:30,830 where reference counting is the simple and easy way to do 1059 00:53:30,830 --> 00:53:35,540 the storage management is huge when you get into doing any 1060 00:53:35,540 --> 00:53:37,530 kind of interesting programming. 
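To see the cycle leak concretely, here is a toy version of the same counting scheme (the node_t type, release, and the live counter are invented for illustration) with two objects pointing at each other. Once the last outside pointers are dropped, each count sticks at one, so neither object is ever freed.

```c
#include <stdlib.h>

int live = 0;                    /* how many nodes are currently allocated */

typedef struct node {
    int ref;                     /* number of incoming pointers */
    struct node *next;
} node_t;

node_t *node_new(void) {         /* one reference to start: the caller's */
    node_t *n = calloc(1, sizeof(node_t));
    n->ref = 1;
    live++;
    return n;
}

void release(node_t *n) {
    if (n && --n->ref == 0) {
        node_t *next = n->next;
        live--;
        free(n);
        release(next);           /* cascade to whatever it pointed at */
    }
}
```

Link a->next = b (bumping b's count) and b->next = a (bumping a's count), then release both outside references: both counts drop from 2 to 1, never reach zero, and the pair leaks even though nothing in the program can reach it anymore.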
1061 00:53:37,530 --> 00:53:39,220 Lots and lots of opportunity to use 1062 00:53:39,220 --> 00:53:41,840 the reference counting approach. 1063 00:53:41,840 --> 00:53:43,420 So questions about reference counting? 1064 00:53:43,420 --> 00:53:47,230 1065 00:53:47,230 --> 00:53:48,130 You have a question? 1066 00:53:48,130 --> 00:53:48,809 Yeah? 1067 00:53:48,809 --> 00:53:50,246 AUDIENCE: So this means you actually have to do this 1068 00:53:50,246 --> 00:53:51,683 [UNINTELLIGIBLE] 1069 00:53:51,683 --> 00:53:53,120 ever do a [INAUDIBLE]? 1070 00:53:53,120 --> 00:53:55,040 You can, like, save the work-- 1071 00:53:55,040 --> 00:53:55,650 CHARLES LEISERSON: That's right. 1072 00:53:55,650 --> 00:53:58,380 Whenever you move a pointer, if you're going to move a 1073 00:53:58,380 --> 00:54:02,180 pointer, you first decrement the counter on the place 1074 00:54:02,180 --> 00:54:04,820 you're moving it from, and then garbage collect if need 1075 00:54:04,820 --> 00:54:07,740 be, and then you move the pointer-- 1076 00:54:07,740 --> 00:54:09,930 that's typically what you do-- you move the pointer and 1077 00:54:09,930 --> 00:54:16,390 increment the count at that location, which typically 1078 00:54:16,390 --> 00:54:19,790 doesn't require any extra work to decrement, which is the 1079 00:54:19,790 --> 00:54:20,680 tricky thing. 1080 00:54:20,680 --> 00:54:20,960 Sorry? 1081 00:54:20,960 --> 00:54:22,722 AUDIENCE: There's no good way to do this in idle time. 1082 00:54:22,722 --> 00:54:26,180 Like I'll have some idle time later, let me do it-- 1083 00:54:26,180 --> 00:54:28,520 CHARLES LEISERSON: So that's what you use true garbage 1084 00:54:28,520 --> 00:54:30,520 collection for. Yeah? 1085 00:54:30,520 --> 00:54:33,790 1086 00:54:33,790 --> 00:54:35,196 AUDIENCE: You could actually, there's techniques where you 1087 00:54:35,196 --> 00:54:38,680 can defer the work of incrementing it.
1088 00:54:38,680 --> 00:54:40,390 Like you basically put the increment and decrement 1089 00:54:40,390 --> 00:54:43,040 operations that you'd like to do into a buffer, and then you 1090 00:54:43,040 --> 00:54:44,100 can do them later. 1091 00:54:44,100 --> 00:54:46,230 But it's sort of the same. 1092 00:54:46,230 --> 00:54:47,610 CHARLES LEISERSON: Yeah, you could log. 1093 00:54:47,610 --> 00:54:50,350 So there are schemes where you log that, oh, I decremented 1094 00:54:50,350 --> 00:54:52,590 this, better check it later, type thing. 1095 00:54:52,590 --> 00:54:54,260 Or I decrement it to 0. 1096 00:54:54,260 --> 00:54:57,550 Let me just put it in there, and I'll go actually do the 1097 00:54:57,550 --> 00:55:00,460 stuff in batch mode at some later time. 1098 00:55:00,460 --> 00:55:01,150 Yep. 1099 00:55:01,150 --> 00:55:05,080 So programming is fun, because you can be clever with 1100 00:55:05,080 --> 00:55:07,020 schemes like that. 1101 00:55:07,020 --> 00:55:09,430 And that could be quite a reasonable thing to do, which 1102 00:55:09,430 --> 00:55:13,370 is wait until you're waiting on the user for input, and at 1103 00:55:13,370 --> 00:55:14,730 that point, collect your garbage. 1104 00:55:14,730 --> 00:55:17,970 Now, the thing you have to risk is, what happens if that 1105 00:55:17,970 --> 00:55:20,690 takes a long time to get there? 1106 00:55:20,690 --> 00:55:27,350 And then there's the complexity of your algorithms 1107 00:55:27,350 --> 00:55:28,060 and so forth. 1108 00:55:28,060 --> 00:55:31,260 So you get into a lot of issues of software maintenance 1109 00:55:31,260 --> 00:55:32,460 and so forth. 
1110 00:55:32,460 --> 00:55:34,990 But that can be a very effective thing, is take it 1111 00:55:34,990 --> 00:55:37,220 out of the critical path of the actual computation you're 1112 00:55:37,220 --> 00:55:40,390 doing, if that's mission critical, and put it in some 1113 00:55:40,390 --> 00:55:42,490 part of the computation where it isn't mission critical. 1114 00:55:42,490 --> 00:55:43,740 Very good idea. 1115 00:55:43,740 --> 00:55:46,875 1116 00:55:46,875 --> 00:55:47,730 I'm sorry? 1117 00:55:47,730 --> 00:55:48,980 AUDIENCE: For thread safety. 1118 00:55:48,980 --> 00:55:51,890 1119 00:55:51,890 --> 00:55:56,120 Like if you have shared reference counts, like 1120 00:55:56,120 --> 00:55:58,600 multiple threads updating the counts, you don't want to have 1121 00:55:58,600 --> 00:56:01,370 them all updating the same count at the same time. 1122 00:56:01,370 --> 00:56:04,160 So they fill a thread-local buffer, and-- 1123 00:56:04,160 --> 00:56:04,670 CHARLES LEISERSON: I see what you're saying. 1124 00:56:04,670 --> 00:56:05,230 Yes. 1125 00:56:05,230 --> 00:56:06,970 So we're going to get into that as we deal with 1126 00:56:06,970 --> 00:56:10,780 multi-threading and so forth, that if you've got two guys 1127 00:56:10,780 --> 00:56:12,490 that are operating on the same data structure, you have to be 1128 00:56:12,490 --> 00:56:15,180 very careful about the consistency, and that one 1129 00:56:15,180 --> 00:56:17,500 doesn't do something while the other is doing something else, 1130 00:56:17,500 --> 00:56:18,310 and so forth. 1131 00:56:18,310 --> 00:56:20,390 So for that reason, it can be a good 1132 00:56:20,390 --> 00:56:21,930 idea to postpone the operation. 1133 00:56:21,930 --> 00:56:25,090 1134 00:56:25,090 --> 00:56:29,150 So general garbage collection is based on a graph 1135 00:56:29,150 --> 00:56:29,800 abstraction. 1136 00:56:29,800 --> 00:56:33,080 So let's move into the graph world.
1137 00:56:33,080 --> 00:56:37,040 So I can view objects and pointers as forming a directed 1138 00:56:37,040 --> 00:56:40,480 graph G = (V, E), where V are the objects and E are the 1139 00:56:40,480 --> 00:56:42,650 pointers from one object to another. 1140 00:56:42,650 --> 00:56:45,310 1141 00:56:45,310 --> 00:56:48,560 The live objects are reachable from the roots. 1142 00:56:48,560 --> 00:56:53,080 And now the thing is that finding all of the live 1143 00:56:53,080 --> 00:56:57,270 objects is then like doing a graph search. 1144 00:56:57,270 --> 00:57:00,770 The common ways of doing a graph search are depth-first 1145 00:57:00,770 --> 00:57:03,080 search and breadth-first search. 1146 00:57:03,080 --> 00:57:05,540 It turns out we're going to use breadth-first search for 1147 00:57:05,540 --> 00:57:08,660 reasons that I'll show you. 1148 00:57:08,660 --> 00:57:10,740 So the idea in breadth-first search is to 1149 00:57:10,740 --> 00:57:12,140 find all the objects. 1150 00:57:12,140 --> 00:57:14,740 So the idea is, what we're going to do is go through and 1151 00:57:14,740 --> 00:57:18,520 find, where are all the objects in my system that are 1152 00:57:18,520 --> 00:57:19,760 reachable from the root? 1153 00:57:19,760 --> 00:57:22,590 And then anything that I didn't find, ah. 1154 00:57:22,590 --> 00:57:27,540 That must be garbage. 1155 00:57:27,540 --> 00:57:28,480 So here's the idea. 1156 00:57:28,480 --> 00:57:34,600 What we're going to use is a FIFO queue called Q. So that's 1157 00:57:34,600 --> 00:57:36,420 a first in, first out queue. 1158 00:57:36,420 --> 00:57:39,070 It has a head and a tail. 1159 00:57:39,070 --> 00:57:41,840 And the mechanism is very much like a stack. 1160 00:57:41,840 --> 00:57:45,110 Whenever I need to enqueue something, I put it on the 1161 00:57:45,110 --> 00:57:48,000 tail end and increment the tail pointer.
1162 00:57:48,000 --> 00:57:50,560 Whenever I need to take something off the queue, I 1163 00:57:50,560 --> 00:57:51,810 increment the head pointer. 1164 00:57:51,810 --> 00:57:54,290 1165 00:57:54,290 --> 00:57:59,800 So I'm always doing head and tail, pushing on the tail and 1166 00:57:59,800 --> 00:58:01,380 pulling off the head. 1167 00:58:01,380 --> 00:58:03,010 Enqueue and dequeue. 1168 00:58:03,010 --> 00:58:07,340 So the way that this works is, I basically go through all my 1169 00:58:07,340 --> 00:58:13,000 objects, and if they're a root, I put a mark on it. 1170 00:58:13,000 --> 00:58:15,120 And I enqueue-- 1171 00:58:15,120 --> 00:58:18,320 this is pseudocode, by the way, so don't hold me to my C 1172 00:58:18,320 --> 00:58:20,580 programming standard here. 1173 00:58:20,580 --> 00:58:26,600 And I mark it, and then I enqueue it on the queue Q. And 1174 00:58:26,600 --> 00:58:29,320 otherwise, I just simply mark it as zero. 1175 00:58:29,320 --> 00:58:33,030 So being marked means whether we've discovered it's 1176 00:58:33,030 --> 00:58:34,740 reachable from a root. 1177 00:58:34,740 --> 00:58:37,660 So to begin with, what we're going to do is mark everything 1178 00:58:37,660 --> 00:58:42,000 that's reachable from the roots, and we're going to 1179 00:58:42,000 --> 00:58:44,870 enqueue them on the FIFO. 1180 00:58:44,870 --> 00:58:46,320 And everything else, we're going to mark as 1181 00:58:46,320 --> 00:58:48,550 unreachable, for now. 1182 00:58:48,550 --> 00:58:51,450 And now what we're going to do is see which ones we discover. 1183 00:58:51,450 --> 00:58:56,930 So what we do is, while the queue is non-empty, we're 1184 00:58:56,930 --> 00:59:01,030 going to take something off of the queue and then look at all 1185 00:59:01,030 --> 00:59:02,520 of its neighbors.
1186 00:59:02,520 --> 00:59:06,890 So for all the edges (u, v), look at all the things that go from u 1187 00:59:06,890 --> 00:59:11,440 to v. And if it's not marked, then we're going to mark it 1188 00:59:11,440 --> 00:59:13,340 and put it on the queue. 1189 00:59:13,340 --> 00:59:15,800 That's it. 1190 00:59:15,800 --> 00:59:17,530 If it's already marked, we don't have to do anything with 1191 00:59:17,530 --> 00:59:19,960 it, but if it's unmarked, we'll mark it and enqueue it. 1192 00:59:19,960 --> 00:59:23,910 And then in the end, everything that's marked is 1193 00:59:23,910 --> 00:59:29,040 going to be something that is live, and everything that is 1194 00:59:29,040 --> 00:59:32,110 unmarked is going to be things that are dead. 1195 00:59:32,110 --> 00:59:35,480 But it's going to be actually even sexier and more clever 1196 00:59:35,480 --> 00:59:37,930 than that, as you'll see. 1197 00:59:37,930 --> 00:59:40,520 So here's an example. 1198 00:59:40,520 --> 00:59:43,030 We start out with my queue being empty, with head and 1199 00:59:43,030 --> 00:59:44,570 tail pointing to the same place. 1200 00:59:44,570 --> 00:59:47,000 And what we do, is we go through and we first-- in this 1201 00:59:47,000 --> 00:59:50,360 case, I have just one root R, right here. 1202 00:59:50,360 --> 00:59:53,780 So we basically mark R and put it on, and 1203 00:59:53,780 --> 00:59:54,790 that's how we get started. 1204 00:59:54,790 --> 00:59:57,640 And everything else is unmarked up here. 1205 00:59:57,640 --> 00:59:59,720 So I'm going to mark with green up here, and I'm going 1206 00:59:59,720 --> 01:00:03,380 to show what's on the queue with the green down here, and 1207 01:00:03,380 --> 01:00:06,510 dark green when I pop something off. 1208 01:00:06,510 --> 01:00:08,920 So the first thing I do is I say, OK. 1209 01:00:08,920 --> 01:00:10,750 Let me dequeue something.
1210 01:00:10,750 --> 01:00:13,900 And in this case, I'm dequeueing R, because it's the 1211 01:00:13,900 --> 01:00:15,180 only thing on there. 1212 01:00:15,180 --> 01:00:17,730 And what I do now is I visit all the neighbors. 1213 01:00:17,730 --> 01:00:22,340 So I visit b, so I put that on, mark it and put that on, 1214 01:00:22,340 --> 01:00:25,200 and then I visit c, so I visit and put that on. 1215 01:00:25,200 --> 01:00:28,330 1216 01:00:28,330 --> 01:00:32,500 Then I'm done with R. I visited all its neighbors. 1217 01:00:32,500 --> 01:00:37,390 So then I go and I dequeue b, and now 1218 01:00:37,390 --> 01:00:38,660 visit all its neighbors. 1219 01:00:38,660 --> 01:00:41,090 Well, in visiting c, there's nothing to be done 1220 01:00:41,090 --> 01:00:43,230 there, because c is already marked. 1221 01:00:43,230 --> 01:00:44,520 So I don't enqueue it again. 1222 01:00:44,520 --> 01:00:47,460 I already have observed that. 1223 01:00:47,460 --> 01:00:55,500 And there's nothing else adjacent to b, so basically, I 1224 01:00:55,500 --> 01:00:59,510 then go on and dequeue c. 1225 01:00:59,510 --> 01:01:02,380 So I have c here, and now we go visit its neighbors, and 1226 01:01:02,380 --> 01:01:04,170 the neighbors are d and e. 1227 01:01:04,170 --> 01:01:11,980 So we enqueue d, enqueue e, and now we are done enqueuing 1228 01:01:11,980 --> 01:01:16,120 all its neighbors and marking them, so now we pop off d. 1229 01:01:16,120 --> 01:01:19,510 And now d has no neighbors, so there's nothing to be done. 1230 01:01:19,510 --> 01:01:23,990 And now I dequeue e, and basically, e has one neighbor, 1231 01:01:23,990 --> 01:01:27,640 f, so we mark f. 1232 01:01:27,640 --> 01:01:34,540 Then we dequeue f, we mark g, then we dequeue g, and g has-- 1233 01:01:34,540 --> 01:01:36,660 the only neighbor is e. 1234 01:01:36,660 --> 01:01:43,330 And now my queue is empty, and so I'm done. 1235 01:01:43,330 --> 01:01:44,690 And now you'll see, what did I do?
1236 01:01:44,690 --> 01:01:48,010 I marked all the things that were reachable from r, and 1237 01:01:48,010 --> 01:01:51,470 everything else is garbage. 1238 01:01:51,470 --> 01:01:54,090 Now, that's only half the story. 1239 01:01:54,090 --> 01:01:57,720 So I've marked them, but what's more relevant, it turns out, for 1240 01:01:57,720 --> 01:02:01,190 garbage collection is to look at what I've got in my queue. 1241 01:02:01,190 --> 01:02:04,320 What do I have in my queue? 1242 01:02:04,320 --> 01:02:11,100 I've got all the live objects; they got put in the queue as we 1243 01:02:11,100 --> 01:02:13,060 were going along. 1244 01:02:13,060 --> 01:02:14,940 All the live objects are put in the queue. 1245 01:02:14,940 --> 01:02:18,040 And not only that, but notice they're all adjacent to each 1246 01:02:18,040 --> 01:02:19,440 other in the queue. 1247 01:02:19,440 --> 01:02:22,222 They're compact in the queue. 1248 01:02:22,222 --> 01:02:25,070 1249 01:02:25,070 --> 01:02:27,050 And so the idea is, we're going to take advantage of 1250 01:02:27,050 --> 01:02:27,920 that property. 1251 01:02:27,920 --> 01:02:30,440 That when I do breadth-first search, when 1252 01:02:30,440 --> 01:02:34,630 I'm done, I've put every live object in the queue. 1253 01:02:34,630 --> 01:02:36,430 So all the live vertices are placed in 1254 01:02:36,430 --> 01:02:37,960 contiguous storage in the queue. 1255 01:02:37,960 --> 01:02:40,580 1256 01:02:40,580 --> 01:02:43,680 So this is the principle behind the copying garbage 1257 01:02:43,680 --> 01:02:47,780 collector, which is one of the more popular garbage 1258 01:02:47,780 --> 01:02:49,540 collectors that's used. 1259 01:02:49,540 --> 01:02:53,860 The idea is that as I'm executing, I'm going to have 1260 01:02:53,860 --> 01:03:01,630 some live objects and some dead objects, and I'm going to 1261 01:03:01,630 --> 01:03:05,830 have some place that I'm doing my next allocation at.
1262 01:03:05,830 --> 01:03:11,410 And as I go through, I allocate more things in what's 1263 01:03:11,410 --> 01:03:15,610 called the FROM space, and I keep allocating, and I may do 1264 01:03:15,610 --> 01:03:19,590 a deallocation, and maybe some more allocation, and maybe 1265 01:03:19,590 --> 01:03:23,030 another deallocation, and some more allocation, and 1266 01:03:23,030 --> 01:03:32,140 eventually I run out of the memory that I've assigned for 1267 01:03:32,140 --> 01:03:34,000 my heap storage. 1268 01:03:34,000 --> 01:03:37,440 So when I get to this point, now what I do is I'm going to 1269 01:03:37,440 --> 01:03:43,120 do breadth-first search on the FROM space. 1270 01:03:43,120 --> 01:03:47,190 And what I'm going to do is I'm going to copy 1271 01:03:47,190 --> 01:03:49,670 the live storage here, using [UNINTELLIGIBLE], to a TO 1272 01:03:49,670 --> 01:03:56,600 space, where the TO space implements the FIFO queue. 1273 01:03:56,600 --> 01:03:59,540 So here's the TO space, and when I go through and I walk 1274 01:03:59,540 --> 01:04:03,250 all that, I've now copied all of the values, all of the 1275 01:04:03,250 --> 01:04:05,690 objects, down to this space over here. 1276 01:04:05,690 --> 01:04:08,390 And now I have this amount of storage left. 1277 01:04:08,390 --> 01:04:10,590 I make the TO space exactly the same 1278 01:04:10,590 --> 01:04:12,130 size as the FROM space. 1279 01:04:12,130 --> 01:04:16,000 I have to make it at least that big, because I know that 1280 01:04:16,000 --> 01:04:19,200 it may be that everything here is alive. 1281 01:04:19,200 --> 01:04:20,660 Maybe nothing got deallocated. 1282 01:04:20,660 --> 01:04:23,490 So I have to make sure that the TO space is large enough 1283 01:04:23,490 --> 01:04:26,340 to handle everything that might be alive there, and it 1284 01:04:26,340 --> 01:04:27,960 may be everything.
1285 01:04:27,960 --> 01:04:30,700 So the TO space is generally allocated to be exactly the 1286 01:04:30,700 --> 01:04:32,025 same size as the FROM space. 1287 01:04:32,025 --> 01:04:34,750 1288 01:04:34,750 --> 01:04:38,140 So basically, the way I compacted is I copied things 1289 01:04:38,140 --> 01:04:40,050 down using the breadth-first search algorithm. 1290 01:04:40,050 --> 01:04:42,700 1291 01:04:42,700 --> 01:04:46,650 Now there's one problem with this, and that is that we have 1292 01:04:46,650 --> 01:04:50,630 pointers in these objects. 1293 01:04:50,630 --> 01:04:54,120 And when I copy it down here, how did I make sure that all 1294 01:04:54,120 --> 01:04:57,050 the pointers now point to the new locations? 1295 01:04:57,050 --> 01:05:00,010 So here's how you do that. 1296 01:05:00,010 --> 01:05:03,740 So since the FROM address of the object is not generally 1297 01:05:03,740 --> 01:05:06,980 equal to the TO address, pointers must be updated. 1298 01:05:06,980 --> 01:05:09,340 So I have to know, of course, where the pointers are. 1299 01:05:09,340 --> 01:05:12,100 So the idea is that when an object is copied to the TO 1300 01:05:12,100 --> 01:05:15,920 space, what we're going to do in FROM space is store a 1301 01:05:15,920 --> 01:05:19,010 forwarding pointer in FROM space saying, 1302 01:05:19,010 --> 01:05:20,260 I copy you to here. 1303 01:05:20,260 --> 01:05:22,800 1304 01:05:22,800 --> 01:05:25,540 So that whenever I look at something in the FROM space 1305 01:05:25,540 --> 01:05:29,490 that's already been copied, I can say, oh, here's where it's 1306 01:05:29,490 --> 01:05:30,805 been copied to. 1307 01:05:30,805 --> 01:05:35,060 It's been copied to this region in the TO space. 1308 01:05:35,060 --> 01:05:38,890 When an object is removed from the FIFO queue in the TO 1309 01:05:38,890 --> 01:05:41,685 space, we're going to update all its pointers. 1310 01:05:41,685 --> 01:05:43,105 I'm going to give you an example of this. 
1311 01:05:43,105 --> 01:05:47,580 1312 01:05:47,580 --> 01:05:50,380 So if you think about it, when I remove something from the 1313 01:05:50,380 --> 01:05:56,120 FIFO queue, at that point, I know all of its adjacent 1314 01:05:56,120 --> 01:06:03,790 vertices have been placed into the queue, into the TO space. 1315 01:06:03,790 --> 01:06:06,570 And so at that point, I know where all the final 1316 01:06:06,570 --> 01:06:09,000 destinations of pointers are. 1317 01:06:09,000 --> 01:06:12,390 So here's an example. 1318 01:06:12,390 --> 01:06:16,930 So suppose that I'm doing the copy of these guys, and I've 1319 01:06:16,930 --> 01:06:19,770 got all the pointers around here, and now some of these 1320 01:06:19,770 --> 01:06:23,420 guys have already made it into the FIFO queue. 1321 01:06:23,420 --> 01:06:25,400 Let's say this guy has made it in, because I've got a 1322 01:06:25,400 --> 01:06:28,120 forwarding pointer to there, and this guy has made it in, 1323 01:06:28,120 --> 01:06:30,080 so I've got a forwarding pointer to there. 1324 01:06:30,080 --> 01:06:33,610 And this is what my queue currently holds. 1325 01:06:33,610 --> 01:06:37,490 So when I remove an item from the queue, I then look at this 1326 01:06:37,490 --> 01:06:40,960 guy, and what I do is, first of all, I visit all of the 1327 01:06:40,960 --> 01:06:42,440 adjacent neighbors. 1328 01:06:42,440 --> 01:06:45,340 So I visit this guy and I say, oh, he's already been copied, 1329 01:06:45,340 --> 01:06:46,910 because I have a forwarding pointer. 1330 01:06:46,910 --> 01:06:48,540 So nothing to be done there. 1331 01:06:48,540 --> 01:06:51,640 But this guy, he hasn't been copied yet. 1332 01:06:51,640 --> 01:06:54,660 So I have to make sure that that guy is copied by 1333 01:06:54,660 --> 01:06:56,270 enqueuing the adjacent vertices. 1334 01:06:56,270 --> 01:07:02,220 So I copy my neighbor to there. 
1335 01:07:02,220 --> 01:07:06,350 And part of copying is, if I'm actually making a copy, I'm 1336 01:07:06,350 --> 01:07:08,630 going to have a pointer from here. 1337 01:07:08,630 --> 01:07:13,590 The pointer now has to point to this guy over here, because 1338 01:07:13,590 --> 01:07:15,140 that's how I'm copying. 1339 01:07:15,140 --> 01:07:17,260 Otherwise it isn't a copy. 1340 01:07:17,260 --> 01:07:20,780 So in the copy, this pointer is the same as that one, is 1341 01:07:20,780 --> 01:07:23,050 pointing to the same storage location, namely 1342 01:07:23,050 --> 01:07:25,520 this place over here. 1343 01:07:25,520 --> 01:07:29,170 So I do that for all my adjacent vertices. 1344 01:07:29,170 --> 01:07:31,660 People follow so far? 1345 01:07:31,660 --> 01:07:35,760 Now I've copied all of the neighbors for this guy. 1346 01:07:35,760 --> 01:07:40,210 Now this guy, all of his neighbors are in their final 1347 01:07:40,210 --> 01:07:43,070 destinations in the TO space. 1348 01:07:43,070 --> 01:07:46,520 So that means I can now update these pointers to be the 1349 01:07:46,520 --> 01:07:48,310 pointers in the TO space. 1350 01:07:48,310 --> 01:07:51,590 So I do that as follows. 1351 01:07:51,590 --> 01:07:54,310 Oh, first of all, I forgot to place the forwarding pointer 1352 01:07:54,310 --> 01:07:56,930 here to indicate that he's been copied. 1353 01:07:56,930 --> 01:07:59,200 That's how I'm doing the marks. 1354 01:07:59,200 --> 01:08:00,540 Sorry about that. 1355 01:08:00,540 --> 01:08:03,690 So I copied the values, and that's just a straight copy, 1356 01:08:03,690 --> 01:08:05,380 because all the pointer values are now 1357 01:08:05,380 --> 01:08:08,130 pointing into this space. 1358 01:08:08,130 --> 01:08:10,170 Then I put a forwarding pointer in. 1359 01:08:10,170 --> 01:08:15,040 Now I'm ready to do the update of the pointers here. 1360 01:08:15,040 --> 01:08:17,819 So I update the pointers to the removed item. 
1361 01:08:17,819 --> 01:08:19,000 So here's one pointer. 1362 01:08:19,000 --> 01:08:21,140 I get rid of him there and I make him point to the 1363 01:08:21,140 --> 01:08:21,770 forwarding thing. 1364 01:08:21,770 --> 01:08:25,100 He used to point to here, now he's pointing to here. 1365 01:08:25,100 --> 01:08:27,569 This one used to point to here, I want to make 1366 01:08:27,569 --> 01:08:28,819 him point to there. 1367 01:08:28,819 --> 01:08:33,399 1368 01:08:33,399 --> 01:08:36,220 There we go. 1369 01:08:36,220 --> 01:08:37,350 So now he's pointing here. 1370 01:08:37,350 --> 01:08:40,979 And now this guy here now has his pointers to his final 1371 01:08:40,979 --> 01:08:43,270 destinations. 1372 01:08:43,270 --> 01:08:45,899 And so the invariance that I'm maintaining is that everything 1373 01:08:45,899 --> 01:08:52,630 before the head of the queue all have their final pointers 1374 01:08:52,630 --> 01:08:54,890 in place in the TO space. 1375 01:08:54,890 --> 01:08:56,359 Question? 1376 01:08:56,359 --> 01:08:59,119 AUDIENCE: So do you know where to put the pointers-- 1377 01:08:59,119 --> 01:09:02,734 how do [UNINTELLIGIBLE] pointers go from the TO space 1378 01:09:02,734 --> 01:09:04,883 to the TO space so that they're actually getting to 1379 01:09:04,883 --> 01:09:05,545 their final destination? 1380 01:09:05,545 --> 01:09:08,025 So my question is, [UNINTELLIGIBLE] by following 1381 01:09:08,025 --> 01:09:09,275 [UNINTELLIGIBLE]. 1382 01:09:09,275 --> 01:09:13,000 1383 01:09:13,000 --> 01:09:15,699 But how do you know what pointer to follow? 1384 01:09:15,699 --> 01:09:17,539 How do you know that [INAUDIBLE]? 1385 01:09:17,539 --> 01:09:19,380 CHARLES LEISERSON: If I have a pointer to something in the 1386 01:09:19,380 --> 01:09:23,870 FROM space, then I update it to be the forwarding pointer 1387 01:09:23,870 --> 01:09:25,196 in the TO space. 
1388 01:09:25,196 --> 01:09:27,015 AUDIENCE: Yeah, the question is, how do you know that 1389 01:09:27,015 --> 01:09:27,729 [INAUDIBLE] 1390 01:09:27,729 --> 01:09:28,859 it's being pointed to [? the ?] 1391 01:09:28,859 --> 01:09:31,607 actually in the TO space? 1392 01:09:31,607 --> 01:09:33,439 Can you check, is this in the TO space? 1393 01:09:33,439 --> 01:09:33,939 CHARLES LEISERSON: Well, yeah. 1394 01:09:33,939 --> 01:09:35,990 I mean, you know what the range of the two spaces is, so 1395 01:09:35,990 --> 01:09:37,240 you could check that. 1396 01:09:37,240 --> 01:09:39,330 1397 01:09:39,330 --> 01:09:39,510 OK. 1398 01:09:39,510 --> 01:09:40,140 Question? 1399 01:09:40,140 --> 01:09:42,322 AUDIENCE: So the reason you would be using [? BFS ?] 1400 01:09:42,322 --> 01:09:44,505 instead of [? DFS ?] is because [INAUDIBLE]. 1401 01:09:44,505 --> 01:09:47,410 1402 01:09:47,410 --> 01:09:49,189 CHARLES LEISERSON: The reason you're using BFS is because it 1403 01:09:49,189 --> 01:09:51,910 has this nice property that the queue represents the 1404 01:09:51,910 --> 01:09:53,390 copying of all the values. 1405 01:09:53,390 --> 01:09:56,600 If I use DFS, I need a stack to do it, and then I'm 1406 01:09:56,600 --> 01:10:01,710 overwriting the locations where the vertices would be 1407 01:10:01,710 --> 01:10:04,280 stored in the data structure. 1408 01:10:04,280 --> 01:10:08,240 By using BFS, there's just this clever thing that the 1409 01:10:08,240 --> 01:10:11,010 queue, when I'm done with BFS, represents all 1410 01:10:11,010 --> 01:10:13,550 of the visited vertices. 1411 01:10:13,550 --> 01:10:15,620 It's very clever. 1412 01:10:15,620 --> 01:10:17,230 Wish I'd thought of that. 1413 01:10:17,230 --> 01:10:19,520 But I was, I think, only 10 years old or something when 1414 01:10:19,520 --> 01:10:23,475 they invented this, and I wasn't that precocious. 1415 01:10:23,475 --> 01:10:26,390 1416 01:10:26,390 --> 01:10:28,570 OK? 
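Putting the pieces together, here is a minimal Python sketch of the copying scheme just described: the TO space doubles as the BFS queue, forwarding pointers are left behind in the FROM space, and an object's pointers are rewritten when it is removed from the queue. The dict-based "address space" is, of course, a stand-in for real memory:

```python
def copy_collect(roots, from_space):
    """Stop-and-copy collection over a toy FROM space.

    `from_space` maps an address to a list of pointer fields
    (addresses).  Returns (new_roots, to_space); addresses in the
    result are simply indices into the to_space list.
    """
    to_space = []   # implements the FIFO queue
    forward = {}    # forwarding pointers left in the FROM space

    def copy(addr):
        # First visit: copy the object to the end of TO space and
        # leave a forwarding pointer.  Later visits just follow it.
        if addr not in forward:
            forward[addr] = len(to_space)
            to_space.append(list(from_space[addr]))  # fields still
        return forward[addr]                         # point into FROM

    new_roots = [copy(r) for r in roots]
    scan = 0                       # head of the queue
    while scan < len(to_space):
        obj = to_space[scan]       # remove the next item
        scan += 1
        # Copy any neighbors not yet copied, then update this
        # object's pointers to their final TO-space addresses.
        obj[:] = [copy(p) for p in obj]
    return new_roots, to_space

# Three live objects in a cycle at "addresses" 100/104/108; the
# self-referencing object at 112 is garbage and is never copied.
from_space = {100: [104], 104: [100, 108], 108: [], 112: [112]}
print(copy_collect([100], from_space))   # ([0], [[1], [0, 2], []])
```

Everything before `scan` already has all of its pointers in their final places, which is exactly the invariant maintained in the lecture.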
1417 01:10:28,570 --> 01:10:30,560 So very, very clever. 1418 01:10:30,560 --> 01:10:35,340 So this is the scheme behind the Minsky, Fenichel, 1419 01:10:35,340 --> 01:10:37,120 Yochelson garbage collector. 1420 01:10:37,120 --> 01:10:40,160 So Marvin Minsky, you may have heard of, was one of the 1421 01:10:40,160 --> 01:10:44,020 inventors of a scheme like this. 1422 01:10:44,020 --> 01:10:45,270 One reason he's so famous. 1423 01:10:45,270 --> 01:10:48,020 1424 01:10:48,020 --> 01:10:49,020 Good. 1425 01:10:49,020 --> 01:10:51,670 So this is really nice, because it takes me linear 1426 01:10:51,670 --> 01:10:55,990 time to copy and update all of the vertices. 1427 01:10:55,990 --> 01:10:57,800 It's just one pass of breadth-first search. 1428 01:10:57,800 --> 01:11:00,990 Very fast to copy and update all the vertices. 1429 01:11:00,990 --> 01:11:02,840 Sometimes, by the way, people call these forwarding 1430 01:11:02,840 --> 01:11:04,760 pointers, they call them broken hearts. 1431 01:11:04,760 --> 01:11:06,220 You'll see that in the literature. 1432 01:11:06,220 --> 01:11:09,400 They call them a broken heart if it's-- 1433 01:11:09,400 --> 01:11:13,750 I don't quite remember why that's the-- 1434 01:11:13,750 --> 01:11:15,580 it's like, I went there, oh, you're not 1435 01:11:15,580 --> 01:11:16,800 there, it's over there-- 1436 01:11:16,800 --> 01:11:18,050 it's like, I don't know. 1437 01:11:18,050 --> 01:11:20,400 1438 01:11:20,400 --> 01:11:23,980 Now, the last thing you have to do-- and this is something 1439 01:11:23,980 --> 01:11:27,370 that's not talked about a lot in the literature-- 1440 01:11:27,370 --> 01:11:29,460 is how do you manage these from and to 1441 01:11:29,460 --> 01:11:33,830 spaces in virtual memory? 1442 01:11:33,830 --> 01:11:37,440 So here's the strategy that works well. 
1443 01:11:37,440 --> 01:11:41,660 So basically, typically what I'm doing is after I've done a 1444 01:11:41,660 --> 01:11:45,130 copy, I've emptied everything out of the FROM space, and 1445 01:11:45,130 --> 01:11:49,590 I've got a portion that's used in the TO space, and then an 1446 01:11:49,590 --> 01:11:52,500 unused portion that I'm going to be able to do allocations 1447 01:11:52,500 --> 01:11:55,900 out of for the next pass around. 1448 01:11:55,900 --> 01:11:59,580 So what I do at that point is I make the old TO 1449 01:11:59,580 --> 01:12:01,260 become the new FROM. 1450 01:12:01,260 --> 01:12:02,980 So here's the new FROM. 1451 01:12:02,980 --> 01:12:04,740 This is available space. 1452 01:12:04,740 --> 01:12:08,940 And now what I do is I extend this space, if necessary, to 1453 01:12:08,940 --> 01:12:12,230 make the new heap space that I'm going to allocate out of 1454 01:12:12,230 --> 01:12:16,130 equal in size to the used space. 1455 01:12:16,130 --> 01:12:18,980 And the reason for doing this is to amortize the cost of 1456 01:12:18,980 --> 01:12:22,640 going through all of the storage, so that the cost of 1457 01:12:22,640 --> 01:12:25,660 doing the garbage collection can be amortized 1458 01:12:25,660 --> 01:12:26,680 against what's used. 1459 01:12:26,680 --> 01:12:29,230 If I didn't extend this, I might have only freed up a 1460 01:12:29,230 --> 01:12:33,170 tiny little bit of space, and now when I run out of that 1461 01:12:33,170 --> 01:12:35,660 space, I'll run the garbage collector going over all the 1462 01:12:35,660 --> 01:12:39,630 elements, but I'll only have gotten a few elements 1463 01:12:39,630 --> 01:12:39,800 [INAUDIBLE]. 1464 01:12:39,800 --> 01:12:42,070 So what I want to do is make sure that I've done at least a 1465 01:12:42,070 --> 01:12:45,300 certain number of allocations proportional to the number 1466 01:12:45,300 --> 01:12:48,090 that I'm going to have to walk over in order to finish it. 
1467 01:12:48,090 --> 01:12:50,630 So that's why you make this be the same size. 1468 01:12:50,630 --> 01:12:54,110 Now when the space runs out, the new TO is allocated with 1469 01:12:54,110 --> 01:12:57,580 the same size as FROM. 1470 01:12:57,580 --> 01:13:03,360 So if you can allocate it back in the old FROM space here, 1471 01:13:03,360 --> 01:13:04,720 that's great. 1472 01:13:04,720 --> 01:13:05,950 If there's room here to allocate it 1473 01:13:05,950 --> 01:13:07,340 at the front, super. 1474 01:13:07,340 --> 01:13:08,600 That's where you allocate it. 1475 01:13:08,600 --> 01:13:12,500 But as in this example, the FROM space here is bigger than 1476 01:13:12,500 --> 01:13:14,770 the space that goes up to here. 1477 01:13:14,770 --> 01:13:19,560 So therefore what I do is I allocate it at the end, and 1478 01:13:19,560 --> 01:13:22,600 then I go through and copy to here. 1479 01:13:22,600 --> 01:13:25,620 Now if this ended up being used, then the next time 1480 01:13:25,620 --> 01:13:30,400 around, I would be able to reuse this space at the front 1481 01:13:30,400 --> 01:13:33,440 for the next time that I flip the garbage collection. 1482 01:13:33,440 --> 01:13:38,220 The idea is, if things will fit as early in memory as 1483 01:13:38,220 --> 01:13:39,470 possible, that's where I put them. 1484 01:13:39,470 --> 01:13:41,760 But if they don't fit, I put them later. 1485 01:13:41,760 --> 01:13:44,420 And you can prove, if you do this, that the virtual memory 1486 01:13:44,420 --> 01:13:48,380 space used is order 1 times the optimal, where you're 1487 01:13:48,380 --> 01:13:51,330 getting nearly the best virtual memory space used 1488 01:13:51,330 --> 01:13:56,050 compared to the amount of items that are in use by the 1489 01:13:56,050 --> 01:13:58,460 user at any time. 1490 01:13:58,460 --> 01:14:00,880 So it keeps everything sort of compact towards 1491 01:14:00,880 --> 01:14:02,310 the front end here. 
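The sizing-and-placement rule can be written down concretely. The sketch below uses hypothetical byte offsets; `place_to_space` just captures the "put it as early in memory as it fits" policy from the lecture, and `next_heap_size` the "make the allocation area as big as what's used" rule:

```python
def next_heap_size(used):
    # After a collection that left `used` bytes live, extend the
    # fresh allocation area to `used` bytes as well.  The next
    # collection costs O(used) work but happens only after at least
    # `used` bytes of allocation, so the amortized cost per
    # allocated byte is O(1).
    return used

def place_to_space(from_start, from_size):
    """Pick the starting offset of the new TO space.

    The TO space must be `from_size` bytes.  If it fits in the
    prefix of memory before the FROM space, reuse that prefix;
    otherwise put it immediately after the FROM space.
    """
    if from_size <= from_start:
        return 0                        # fits at the front: reuse it
    return from_start + from_size       # otherwise extend at the end

print(place_to_space(8, 4))   # 0: a 4-byte TO space fits before offset 8
print(place_to_space(2, 6))   # 8: doesn't fit in front, goes after FROM
```

With this discipline, the total virtual-memory footprint stays within a constant factor of the space actually in use, which is the order-1-times-optimal claim above.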
1492 01:14:02,310 --> 01:14:04,360 So in this case, I had to allocate to here. 1493 01:14:04,360 --> 01:14:07,410 If the use had been very small, I could have allocated 1494 01:14:07,410 --> 01:14:09,630 just a very small amount for the heap. 1495 01:14:09,630 --> 01:14:12,490 When I was done, the new TO space would 1496 01:14:12,490 --> 01:14:13,550 have fit at the beginning. 1497 01:14:13,550 --> 01:14:16,380 I would have allocated it at the beginning. 1498 01:14:16,380 --> 01:14:19,550 And it's a nice little exercise, for those of you who 1499 01:14:19,550 --> 01:14:23,050 are analytically inclined, to figure out what the constant 1500 01:14:23,050 --> 01:14:27,570 is here, because it's a little bit of a game to figure out 1501 01:14:27,570 --> 01:14:30,730 what the amortized constant is, for those who have some 1502 01:14:30,730 --> 01:14:37,700 algorithmic energy and are confident about working on 1503 01:14:37,700 --> 01:14:39,365 this kind of thing instead of studying for the quiz. 1504 01:14:39,365 --> 01:14:43,460 1505 01:14:43,460 --> 01:14:45,880 So dynamic storage allocation. 1506 01:14:45,880 --> 01:14:48,960 Hugely interesting area. 1507 01:14:48,960 --> 01:14:50,930 Lots is known and lots is unknown. 1508 01:14:50,930 --> 01:14:52,430 For example, I mentioned coalescing. 1509 01:14:52,430 --> 01:14:54,460 We really don't understand coalescing. 1510 01:14:54,460 --> 01:14:57,680 Despite the fact that people have been doing it for 50 years 1511 01:14:57,680 --> 01:15:01,490 or more, we do not understand coalescing. 1512 01:15:01,490 --> 01:15:04,190 There are a lot of strategies that people have studied, 1513 01:15:04,190 --> 01:15:06,300 things like the buddy system, I mentioned. 1514 01:15:06,300 --> 01:15:09,050 There are other types of garbage collectors called mark 1515 01:15:09,050 --> 01:15:10,820 and sweep garbage collection. 
1516 01:15:10,820 --> 01:15:13,500 Those are typically inferior to the copying garbage 1517 01:15:13,500 --> 01:15:19,070 collection, because they leave you with a 1518 01:15:19,070 --> 01:15:21,980 memory that isn't compact. 1519 01:15:21,980 --> 01:15:24,390 And so you end up using more and more of your virtual 1520 01:15:24,390 --> 01:15:27,560 memory, because you're leaving things in place. 1521 01:15:27,560 --> 01:15:29,840 On the other hand, you leave things in place so there's no 1522 01:15:29,840 --> 01:15:32,100 need to move pointers. 1523 01:15:32,100 --> 01:15:33,970 So that can be a valuable thing. 1524 01:15:33,970 --> 01:15:35,620 There's things called generational garbage 1525 01:15:35,620 --> 01:15:36,650 collection. 1526 01:15:36,650 --> 01:15:39,660 Generational garbage collectors realize that things 1527 01:15:39,660 --> 01:15:43,930 that are allocated typically are freed very soon 1528 01:15:43,930 --> 01:15:48,660 thereafter, and so therefore, if you especially take care of 1529 01:15:48,660 --> 01:15:51,080 the things that are allocated and guess that they're going 1530 01:15:51,080 --> 01:15:55,640 to be used again, and that things that haven't been 1531 01:15:55,640 --> 01:15:58,980 deallocated for a long time are very unlikely to be 1532 01:15:58,980 --> 01:16:02,540 allocated, you can get better performance out 1533 01:16:02,540 --> 01:16:04,580 of a garbage collector. 1534 01:16:04,580 --> 01:16:06,730 There's real time garbage collection. 1535 01:16:06,730 --> 01:16:08,810 So this is what I've explained, is what's called a 1536 01:16:08,810 --> 01:16:11,500 stop and copy garbage collector. 1537 01:16:11,500 --> 01:16:13,560 Meaning that when you run out of storage, oops! 1538 01:16:13,560 --> 01:16:15,430 Stop the world, run the garbage collector. 
1539 01:16:15,430 --> 01:16:17,880 There are real time garbage collectors which, whenever you 1540 01:16:17,880 --> 01:16:21,980 do an allocation, it goes and it does a few cycles of 1541 01:16:21,980 --> 01:16:26,680 garbage collection, and is such that you can always keep 1542 01:16:26,680 --> 01:16:30,060 up with the allocations and deallocations going on. 1543 01:16:30,060 --> 01:16:34,070 By doing a few cycles, you eventually get to garbage 1544 01:16:34,070 --> 01:16:37,960 collect everything, but you do it intermixed, so that you're 1545 01:16:37,960 --> 01:16:40,140 slowing everything down by a constant amount, rather than 1546 01:16:40,140 --> 01:16:42,990 running and suddenly, OK, your robot arm is going like this, 1547 01:16:42,990 --> 01:16:45,276 oops, garbage collect, garbage collect, garbage collect, 1548 01:16:45,276 --> 01:16:46,850 [CREAKING SOUND]. 1549 01:16:46,850 --> 01:16:50,160 So you can have your robot arm move smoothly, because it's 1550 01:16:50,160 --> 01:16:52,080 paying the price all the way. 1551 01:16:52,080 --> 01:16:54,430 It's not stopping and going. 1552 01:16:54,430 --> 01:16:57,570 There is multi-threaded storage allocation. 1553 01:16:57,570 --> 01:17:04,390 What happens when you have a multi-threaded environment 1554 01:17:04,390 --> 01:17:07,240 where you have many processors that are all allocating out of 1555 01:17:07,240 --> 01:17:08,040 the same pool? 1556 01:17:08,040 --> 01:17:09,790 How do they organize it? 1557 01:17:09,790 --> 01:17:11,920 You can have the interesting things going there where 1558 01:17:11,920 --> 01:17:15,990 things are allocated on one processor, and then get 1559 01:17:15,990 --> 01:17:17,800 deallocated on another. 
1560 01:17:17,800 --> 01:17:20,050 And if you're not careful, you have something like a memory 1561 01:17:20,050 --> 01:17:24,420 leak going on, which is called memory drift, where the memory 1562 01:17:24,420 --> 01:17:27,660 gets allocated on one, and one guy's busy creating it, and it 1563 01:17:27,660 --> 01:17:30,360 keeps getting deallocated on the other, but never gets 1564 01:17:30,360 --> 01:17:33,210 recycled back to the guy who needs it. 1565 01:17:33,210 --> 01:17:36,280 And so how is it that you figure out to make sure that 1566 01:17:36,280 --> 01:17:39,590 memory drift doesn't end up having you grow your memory? 1567 01:17:39,590 --> 01:17:42,580 In fact, many, many garbage collectors that are out there 1568 01:17:42,580 --> 01:17:45,910 have this memory drift problem for multi-threaded storage 1569 01:17:45,910 --> 01:17:46,710 allocation. 1570 01:17:46,710 --> 01:17:48,740 There's also parallel garbage collection. 1571 01:17:48,740 --> 01:17:51,480 How do you, in a parallel environment, have the garbage 1572 01:17:51,480 --> 01:17:54,470 collector going on as a separate process while you 1573 01:17:54,470 --> 01:17:56,670 have other things going on at the same time? 1574 01:17:56,670 --> 01:17:59,960 So a huge, huge, huge area of storage. 1575 01:17:59,960 --> 01:18:05,240 And for our next lab, lab 3, you'll be building your own 1576 01:18:05,240 --> 01:18:06,100 storage allocator. 1577 01:18:06,100 --> 01:18:08,880 So you have a lot of fun with this. 1578 01:18:08,880 --> 01:18:10,580 [? Reid ?] says this is his favorite lab. 1579 01:18:10,580 --> 01:18:13,560 1580 01:18:13,560 --> 01:18:13,820 Great. 1581 01:18:13,820 --> 01:18:15,070 Any questions? 1582 01:18:15,070 --> 01:18:17,560 1583 01:18:17,560 --> 01:18:17,910 OK. 1584 01:18:17,910 --> 01:18:19,300 Good luck on the quiz next time. 1585 01:18:19,300 --> 01:18:20,885